Description

The present invention relates to an oxygen-regulated protein 150 (ORP150). Specifically, the invention relates to
the amino acid sequence of such ORP150 polypeptides, polynucleotides encoding ORP150 polypeptides, promoters
of ORP150 genes and antibodies specific to ORP150 polypeptides.

Since the expression of a 70 kDa heat shock protein (HSP70) in cerebral ischemic lesions was reported for the first
time, various stress proteins, represented by HSP70, have been reported to be expressed in myocardial ischemic and
atherosclerotic lesions, as well as cerebral ischemic lesions. The fact that the induction of HSP, a mechanism of defence
against heat stress, is seen in ischemic lesions suggests that the stress response of the body to ischemic hypoxia is
an active phenomenon involving protein neogenesis. In cultured cells, stressful situations that cause ischemia
in vivo, such as hypoglycemia and hypoxia, have been shown to induce a group of non-HSP stress proteins, such as
glucose-regulated protein (GRP) and oxygen-regulated protein (ORP).

ORP is therefore expected to serve in the diagnosis and treatment of ischemic diseases. 

Hori et al. have recently found that exposure of cultured rat astrocytes to hypoxic conditions induces 150, 94, 78,
33 and 28 kDa proteins [J. Neurochem., 66, 973-979 (1996)]. These proteins, other than the 150 kDa protein, were
identified as GRP94, GRP78, heme oxygenase 1 and HSP28, respectively, while the 150 kDa protein (rat ORP150) remains
unidentified. In addition, there has been no report of a human ORP150 protein.

Accordingly, the technical problem underlying the present invention is to provide ORP150 proteins, namely those
of human and rat origin, the amino acid sequences of these proteins as well as nucleotide sequences encoding these
proteins, the promoter regions of the corresponding genes and antibodies against ORP150 proteins or fragments
thereof which are useful in the diagnosis and treatment of ischemic diseases.

This technical problem has been solved by the provision of the embodiments characterized in the claims. 

Thus, in a first aspect, the present invention relates to a polynucleotide encoding an ORP150 polypeptide selected 
from the group consisting of: 


(a) polynucleotides encoding the polypeptide having the amino acid sequence as depicted in SEQ ID NO:1 or a 
fragment of the polypeptide; 

(b) polynucleotides comprising the coding region of the nucleotide sequence as shown in SEQ ID NO:2 or a frag- 
ment thereof; 

(c) polynucleotides encoding the polypeptide having the amino acid sequence as depicted in SEQ ID NO:3 or a
fragment of the polypeptide;

(d) polynucleotides comprising the coding region of the nucleotide sequence as depicted in SEQ ID NO:4 or a frag- 
ment thereof; 

(e) polynucleotides encoding an ORP150 polypeptide which differs from the polypeptide encoded by the
polynucleotide of (a) or (c) due to deletion(s), addition(s), insertion(s) and/or substitution(s) of one or more
amino acid residues; and

(f) polynucleotides the complementary strand of which hybridizes to a polynucleotide of any one of (a) to (e) and 
which encode an ORP150 polypeptide; 

and the complementary strand of such a polynucleotide.

In still another embodiment, the present invention relates to a polynucleotide capable of hybridizing to the above 
polynucleotide or a fragment thereof and having promoter activity. 

In still another embodiment, the present invention relates to a recombinant DNA, e.g. vectors, which contains a 
nucleotide sequence of the present invention. 
In still another embodiment, the present invention relates to an expression vector which contains the recombinant
DNA of the present invention, to host cells transformed with polynucleotides or vectors of the invention and to a process 
for the production of an ORP150 protein by cultivating such host cells. In a further embodiment, the present invention 
relates to the polypeptides encoded by the polynucleotides of the invention. 

In still another embodiment, the present invention relates to an antibody or fragment thereof which specifically
binds to the polypeptide of the present invention, and to nucleic acid molecules which specifically hybridize to
polynucleotides of the present invention.

In still another embodiment the present invention relates to pharmaceutical and diagnostic compositions compris- 
ing the above-described polynucleotides, polypeptides, antibodies and/or nucleic acid molecules. 

Figure 1 shows a schematic diagram of the exon-intron structure of the human ORP gene. Black squares
represent the exons.

Figure 2 shows the results of the Northern blot analysis of ORP150 mRNA extracted from human astrocytoma
U373 cells after exposure to various types of stress.

Figure 3 shows the results of the Northern blot analysis of ORP150 mRNA from adult human tissues. 

One embodiment of a polynucleotide of the present invention is a polynucleotide encoding a polypeptide comprising
the amino acid sequence shown by SEQ ID NO:1 in the sequence listing, and constituting the human oxygen-regulated
protein ORP150 which is obtainable by inducement under hypoxic conditions. Another embodiment of a
polynucleotide of the present invention is a polynucleotide encoding a polypeptide comprising the amino acid sequence
shown by SEQ ID NO:3 in the sequence listing, and constituting the rat oxygen-regulated protein ORP150 which is
obtainable by inducement under hypoxic conditions. The polynucleotides of the present invention also include those
which code for polypeptides each comprising a portion of the above-described polypeptides, and those encoding the
entire or portion of the above-described polypeptides. It is a well-known fact that mutation occurs in nature; some of the
amino acids of ORP150 protein may be replaced or deleted, and other amino acids may be added or inserted. Mutation
can also be induced by gene engineering technology. It is therefore to be understood that substantially homologous
polypeptides resulting from such mutations in one or more amino acid residues are also included in the scope of the
present invention as long as they are obtainable by inducement under hypoxic conditions.

Further embodiments of a polynucleotide of the present invention are polynucleotides comprising the nucleotide
sequence shown by SEQ ID NO:2 in the sequence listing, i.e., human ORP150 cDNA, and polynucleotides comprising
the nucleotide sequence shown by SEQ ID NO:4 in the sequence listing, which represents rat ORP150 cDNA.
Polynucleotides comprising a portion of these polynucleotides, and those containing the entire or portion of these
polynucleotides, are also included in the scope of the present invention. As stated above, the ORP150 gene may have some
bases replaced, deleted, added or inserted by mutations, and the resulting polynucleotides with partially different
nucleotide sequences are also included in the scope of the present invention, as long as they are substantially homologous
and encode a polypeptide obtainable by inducement under hypoxic conditions.

The present invention also relates to a polynucleotide the complementary strand of which hybridizes to a
polynucleotide as described above and which codes for an ORP150 polypeptide, that is, for a polypeptide inducible under
hypoxic conditions. "Hybridizing" in this regard preferably means hybridization under stringent conditions. The
hybridizing polynucleotides preferably have a sequence identity of at least 50%, most preferably of at least 70%, with the
polynucleotides described above. The term "stringent conditions" means that hybridization will occur only if there is at least
95% and preferably at least 97% identity between the sequences.
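
The identity thresholds named above can be illustrated with a minimal sketch. It assumes that the two sequences have already been aligned to equal length and that identity is counted position by position; the function names and the one-base variant are hypothetical and serve illustration only. The description itself does not prescribe any particular way of computing sequence identity.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions carrying the same base.

    Assumes the two sequences are already aligned and of equal length;
    a gap character ('-') counts as a mismatch.
    """
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    a, b = seq_a.upper(), seq_b.upper()
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


def identity_class(pid: float) -> str:
    """Map a percent identity onto the thresholds named in the text."""
    if pid >= 97.0:
        return "stringent, preferred (>= 97%)"
    if pid >= 95.0:
        return "stringent (>= 95%)"
    if pid >= 70.0:
        return "hybridizing, most preferred (>= 70%)"
    if pid >= 50.0:
        return "hybridizing, preferred (>= 50%)"
    return "below all stated thresholds"


if __name__ == "__main__":
    # A 20-base sequence and a hypothetical variant differing at the last base.
    reference = "GCACCCTTGAGGAAAATGCT"
    candidate = "GCACCCTTGAGGAAAATGCA"
    pid = percent_identity(reference, candidate)
    print(f"{pid:.1f}% identity: {identity_class(pid)}")
```

In practice the identity would be computed over an alignment produced by a dedicated alignment program; the sketch only shows how the stated percentages partition candidate sequences.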

The polynucleotides of the present invention may be RNA or DNA molecules. DNA molecules can, for example, be
cDNA, genomic DNA, double or single stranded DNA, isolated from natural sources, produced in vitro or by chemical
synthesis methods. The polynucleotides of the invention can code for an ORP150 polypeptide from any organism
expressing such a polypeptide, preferably from eukaryotes, for example, insects, vertebrates, preferably mammals and
most preferably from human, rat, mouse, bovine, sheep, goat or pig.

Furthermore, the present invention also relates to recombinant nucleic acid molecules which comprise a
polynucleotide according to the invention. Examples of such molecules are vectors, namely plasmids, cosmids, phagemids,
recombinant phages, viruses etc.

In a preferred embodiment the polynucleotide according to the invention present in such a recombinant nucleic acid
molecule is linked to regulatory elements which allow for expression in prokaryotic or eukaryotic host cells. Such
regulatory elements are well known in the art and include promoters, transcriptional and translational enhancers and the
like.

The term "recombinant DNA" as used herein is defined as any DNA containing a polynucleotide described above. 

The term "expression vector" as used herein is defined as any vector containing the recombinant DNA of the
present invention and expressing a desired protein by introduction into the appropriate host.

The term "clone" as used herein means not only a cell into which a polynucleotide of interest has been introduced 
but also the polynucleotide of interest itself. 

The term "inducement under hypoxic conditions" used herein means an increase in protein synthesis upon expos- 
ing cells to an oxygen-depleted atmosphere. 
The present invention furthermore relates to host cells transformed and genetically engineered with a
polynucleotide according to the invention. These may be prokaryotic or eukaryotic cells. They may be homologous or heterologous
with respect to the introduced polynucleotide. If they are homologous they can be distinguished from naturally occurring
cells by the feature that they comprise, in addition to a naturally occurring ORP150 gene, at least one further copy of an
ORP150 coding region which is integrated into the genome in a position in which it does not normally occur. This can
be confirmed, e.g., by Southern blotting. Suitable host cells include, for example, bacteria such as E. coli and Bacillus
subtilis, yeast such as S. cerevisiae, vertebrate cells, insect cells, mammalian cells, e.g. rat, mouse or human cells.

Moreover, the present invention relates to a process for the production of an ORP150 polypeptide which comprises
the steps of culturing the host according to the invention and recovering the produced polypeptide from the cells and/or 
the culture medium. 

The present invention also relates to the polypeptides encoded by the polynucleotides according to the invention
or obtainable by the above described process. 

The amino acid sequences and nucleotide sequences of the present invention can, for example, be determined as
follows: First, poly(A)+ RNA is prepared from rat astrocytes exposed to hypoxic conditions. After cDNA is synthesized
from said poly(A)+ RNA using random hexamer primers, a cDNA library is prepared using the pSPORT1 vector
(produced by Life Technology), or the like.

Next, PCR is conducted using oligonucleotide primers synthesized on the basis of the nucleotide sequence of the
pSPORT1 vector used to prepare the cDNA library above and the degenerate nucleotide sequences deduced from the
N-terminal amino acid sequence of purified rat ORP150, to yield a large number of amplified DNA fragments. These
DNA fragments are then inserted into the pT7 Blue vector (produced by Novagen), or the like, for cloning to obtain a
clone having a nucleotide sequence which perfectly encodes the N-terminal amino acid sequence. Purification of
ORP150 can be achieved by commonly used methods of protein purification, such as column chromatography and
electrophoresis, in combination as appropriate.

In addition, by screening the above-described rat astrocyte cDNA library by colony hybridization using the insert of
the above clone as a probe, a clone having an insert thought to encode rat ORP150 can be obtained. This clone is
subjected to stepwise deletion from both the 5'- and 3'-ends, and oligonucleotide primers prepared from determined
nucleotide sequences are used to determine the nucleotide sequence sequentially. If the clone thus obtained does not
encode the full length of rat ORP150, an oligonucleotide probe is synthesized on the basis of the nucleotide sequence
of the 5'- or 3'-region of the insert, followed by screening for a clone containing the nucleotide sequence extended
further in the 5' or 3' direction, for example, using the Gene Trapper cDNA Positive Selection System Kit (produced by Life
Technology) based on hybridization using magnetic beads. The full-length cDNA of the rat ORP150 gene is thus obtained.

Separately, the following procedure is followed to obtain a human homologue of rat ORP150 cDNA. Poly(A)+ RNA
is prepared from the human astrocytoma U373 exposed to hypoxic conditions. After cDNA is synthesized from said
poly(A)+ RNA using random hexamer primers and an oligo(dT) primer, said cDNA is inserted into the EcoRI site of the
pSPORT1 vector to prepare a cDNA library. Human ORP150 cDNA is then obtained using the Gene Trapper Kit and
the nucleotide sequence is determined in the same manner as with rat ORP150 above.

The nucleotide sequence of human ORP150 cDNA is thus determined as that shown by SEQ ID NO:2 in the 
sequence listing, based on which the amino acid sequence of human ORP150 is determined. 

Exposure of astrocytes to hypoxic conditions can, for example, be achieved by the method of Ogawa et al. [Ogawa,
S., Gerlach, H., Esposito, C., Macaulay, A.P., Brett, J., and Stern, D., J. Clin. Invest., 85, 1090-1098 (1990)].

Furthermore, the following procedure is followed to obtain human ORP150 genomic DNA. A genomic library
purchased from Clontech (derived from human placenta, Cat. #HL1067J) is used. Screening is conducted by hybridization
using a DNA fragment consisting of 202 bp of the 5' untranslated region and 369 bp of the coding region, derived from
the rat cDNA clone, as well as a 1351 bp DNA fragment containing the termination codon, derived from the human
cDNA, as probes. Two clones containing the ORP150 gene are isolated, one containing exons 1 through 24 and the
other containing exons 16 through 26; the entire ORP150 gene is composed by combining these two clones. The
nucleotide sequence of the 15851 bp human ORP150 genomic DNA is determined; its nucleotide sequence from the 5'-end
to just before the translation initiation codon ATG in exon 2 is shown by SEQ ID NO:12 in the sequence listing.

As stated above, the present invention includes polypeptides containing the entire or portion of the polypeptide
(human ORP150) having the amino acid sequence shown by SEQ ID NO:1 in the sequence listing. The present
invention also includes polynucleotides encoding the entire or portion of the polypeptide having the amino acid sequence
shown by SEQ ID NO:1 in the sequence listing; for example, polynucleotides containing the entire or portion of the
nucleotide sequence shown by SEQ ID NO:2 in the sequence listing are included in the scope of the present invention.
The present invention also includes specific antibodies against these polypeptides of the present invention, and
fragments thereof.
An antibody against a polypeptide of the present invention, which polypeptide contains the entire or portion of
human or rat ORP150, can be prepared by a conventional method [Current Protocols in Immunology, Coligan, J.E. et
al. eds., 2.4.1-2.4.7, John Wiley & Sons, New York (1991)]. Specifically, a rat ORP150 band, separated by, for example,
SDS-polyacrylamide gel electrophoresis, is cut out and administered to a rabbit or the like for immunization, after which
blood is collected from the immunized animal to obtain an antiserum. An IgG fraction can be obtained if necessary by
affinity chromatography using immobilized protein A, or the like. A peptide identical to a partial amino acid sequence of ORP150
can be chemically synthesized as a multiple antigen peptide (MAP) [Tam, J. P., Proc. Natl. Acad. Sci. USA, 85,
5409-5413 (1988)], and can be used for immunization in the same manner as above.

It is also possible to prepare a monoclonal antibody by a conventional method [Cell & Tissue Culture; Laboratory
Procedure (Doyle, A. et al., eds.) 25A:1-25C:4, John Wiley & Sons, New York (1994)] using a polypeptide containing
the entire or portion of human or rat ORP150 as an antigen. Specifically, a hybridoma is prepared by fusing
splenocytes from a mouse immunized with said antigen with a myeloma cell line, and the resulting hybridoma is cultured or
intraperitoneally transplanted into a mouse to produce a monoclonal antibody.

The fragments resulting from protease digestion of these antibodies as purified can also be used as antibodies of 
the present invention. 

The present invention also relates to nucleic acid molecules which specifically hybridize with a polynucleotide
according to the invention or with the complementary strand of such a polynucleotide. "Specifically hybridizing" means
that such molecules show no significant cross-hybridization to polynucleotides coding for proteins other than an
ORP150 polypeptide. Preferably these nucleic acid molecules have a length of at least 15 nucleotides, more preferably
of at least 30 nucleotides and most preferably of at least 50 nucleotides. In a preferred embodiment these molecules
have over their entire length a sequence identity to a corresponding region of a polynucleotide of the invention of at least
85%, preferably of at least 90% and most preferably of at least 95%. In a particularly preferred embodiment the
sequence identity is at least 97%. These nucleic acid molecules can be used, for example, as hybridization probes for
the isolation of related genes, as PCR primers, for the diagnosis of mutations of ORP150 genes, for use in antisense
molecules or ribozymes, or the like.

The polynucleotides of the present invention, the polypeptides encoded by them, specific antibodies against these 
polypeptides or fragments thereof and the nucleic acid molecules specifically hybridizing to the above-mentioned poly- 
nucleotides are useful in the diagnosis and treatment of ischemic diseases, permitting utilization for the development of 
therapeutic drugs for ischemic diseases. 

Thus, the present invention also relates to a pharmaceutical composition comprising a polynucleotide, polypeptide,
antibody and/or nucleic acid molecule according to the invention. Optionally, such a composition also comprises a
pharmaceutically acceptable carrier.

The invention also relates to a diagnostic composition comprising a polynucleotide, polypeptide, antibody and/or
nucleic acid molecule according to the invention. 

In another embodiment the present invention relates to a polynucleotide comprising or containing the entire or
portion of the nucleotide sequence shown by SEQ ID NO:12 in the sequence listing. This is a polynucleotide containing the
promoter region of the human ORP150 gene. Polynucleotides capable of hybridizing to this polynucleotide under
conventional hybridizing conditions (e.g., in 0.1 x SSC containing 0.1% SDS at 65°C) and possessing promoter activity are
also included in the scope of the present invention. Preferably, such a promoter is able to promote transcription in cells
when exposed to hypoxia. Successful cloning of said promoter region would dramatically advance the functional
analysis of the human ORP150 gene and facilitate its application to the treatment of ischemic diseases.

The term "promoter" as used herein is defined as a polynucleotide comprising a nucleotide sequence that activates 
or suppresses the transcription of a desired gene by being present upstream or downstream of said gene. 

The following examples illustrate the present invention:


Example 1 

Cell culture and achievement of hypoxic condition 

Rat primary astrocytes and microglia were obtained from neonatal rats by a modification of a previously described
method [Maeda, Y., Matsumoto, M., Ohtsuki, T., Kuwabara, K., Ogawa, S., Hori, O., Shui, D.Y., Kinoshita, T., Kamada,
T., and Stern, D., J. Exp. Med., 180, 2297-2308 (1994)]. Briefly, cerebral hemispheres were harvested from neonatal
Sprague-Dawley rats within 24 hours after birth, meninges were carefully removed, and brain tissue was digested at
37°C in minimal essential medium (MEM) with Joklik's modification (Gibco, Boston MA) containing Dispase II (3 mg/ml;
Boehringer-Mannheim, Germany). After centrifugation, the cell pellet was resuspended and grown in MEM
supplemented with fetal calf serum (FCS; 10%; CellGrow, MA).

After 10 days, cytosine arabinofuranoside (10ng/ml; Wako Chemicals, Osaka, Japan) was added for 48 hours to
prevent fibroblast overgrowth, and culture flasks were agitated on a shaking platform. Then, floating cells were
aspirated (these were microglia), and the adherent cell population was identified by morphological criteria and
immunohistochemical staining with anti-glial fibrillary acidic protein antibody. Cultures used for experiments were >98% astrocytes
based on these techniques.

Human astrocytoma cell line U373 was obtained from American Type Culture Collection (ATCC) and cultured in 
Dulbecco's modified Eagle medium (produced by Life Technology) supplemented with 10% FCS. 

Cells were plated at a density of about 5 x 10^4 cells/cm^2 in the above medium. When cultures achieved
confluence, they were exposed to hypoxia using an incubator attached to a hypoxia chamber which maintained a humidified
atmosphere with low oxygen tension (Coy Laboratory Products, Ann Arbor MI) as described previously [Ogawa, S.,
Gerlach, H., Esposito, C., Macaulay, A.P., Brett, J., and Stern, D., J. Clin. Invest., 85, 1090-1098 (1990)].

Example 2 


Purification and N-terminal sequencing of the rat 150 kDa polypeptide

Rat primary astrocytes (about 5 x 10^8 cells) exposed to hypoxia for 48 hours were harvested, cells were washed
three times with PBS (pH 7.0) and protein was extracted with PBS containing NP-40 (1%), PMSF (1 mM), and EDTA
(5 mM). Extracts were then filtered (0.45 µm nitrocellulose membrane), and either subjected to reduced SDS-PAGE
(7.5%, about 25ng) or 2-3 mg of protein was diluted with 50 ml of PBS (pH 7.0) containing NP-40 (0.05%) and EDTA
(5 mM), and applied to FPLC Mono Q (bed volume 5 ml, Pharmacia, Sweden).

The column was washed with 0.2 M NaCl, eluted with an ascending salt gradient (0.2 to 1.8 M NaCl) and 10 µl of
each fraction (0.5 ml) was applied to reduced SDS-PAGE (7.5%), along with molecular weight markers (Biorad).
Proteins in the gel were visualized by silver staining. Fractions eluted from FPLC Mono Q which contained the 150 kDa
polypeptide (#7-8) were pooled and concentrated 50-fold by ultrafiltration (Amicon), and about 200 µg of protein was
applied to preparative, reduced SDS-PAGE (7.5%). Following electrophoresis, proteins in the gel were transferred
electrophoretically (2 A/cm^2) to polyvinylidene difluoride (PVDF) paper (Millipore, Tokyo), the paper was dried, stained with
Coomassie Brilliant blue, and the band corresponding to the 150 kDa protein (ORP150) was cut out for N-terminal
sequencing using an automated peptide sequencing system (Applied Biosystems, Perkin-Elmer). The N-terminal
31-amino acid sequence was thus determined (SEQ ID NO:5).

Example 3 


Preparation of rat astrocyte cDNA library 

Total RNA was prepared from rat primary astrocytes (1.1 x 10^8 cells), in which ORP150 had been induced under
hypoxic conditions, by the acid guanidinium-phenol-chloroform method [Chomczynski, P. and Sacchi, N., Anal.
Biochem., 162, 156-159 (1987)]. Using 300 µg of the total RNA obtained, purification was conducted twice in accordance
with the protocol for poly(A)+ RNA purification using oligo(dT)-magnetic beads (produced by Perceptive Diagnostics),
to yield poly(A)+ RNA. Double-stranded cDNA was then synthesized using random hexamer primers, in accordance
with the protocol for the Superscript Choice System (produced by Life Technology), and inserted into the EcoRI site of
the pSPORT1 vector to prepare a cDNA library consisting of 5.4 x 10^5 independent clones.


Example 4 

Cloning of rat ORP150 cDNA

Rat ORP150 cDNA was cloned as follows: First, to obtain a probe for colony hybridization, the cDNA library was
subjected to PCR using a 20-base primer, 5'-AATACGACTCACTATAGGGA-3' (SEQ ID NO:6), which corresponds to the
antisense strand of the T7 promoter region in the pSPORT1 vector, and 20-base mixed primers,
5'-AARCCiGGiGTNCCNATGGA-3' (SEQ ID NO:8), which contain inosine residues and degenerate nucleotides and which were
prepared on the basis of the oligonucleotide sequence deduced from a partial sequence (KPGVPME) (SEQ ID NO:7)
within the N-terminal amino acid sequence (LAVMSVDLGSESMKVAIVKPGVPMEIVLNKE) (SEQ ID NO:5); the
resulting PCR product with a length of about 480 bp was inserted into the pT7 Blue Plasmid vector. Nucleotide sequences of
the clones containing an insert of the expected size (480 bp) corresponding to the PCR product were determined using
an automatic nucleotide sequencer (produced by Perkin-Elmer, Applied Biosystems). A clone containing a
39-nucleotide sequence encoding a peptide identical to the rat ORP150-specific amino acid sequence KPGVPMEIVLNKE (SEQ
ID NO:9) in the insert was thus obtained.
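
The relationship between the partial peptide KPGVPME and the 20-base mixed primer can be checked with a short sketch. It is illustrative only and rests on stated assumptions: the standard genetic code is used, inosine (written here as I) is treated as matching any base, and all function names are hypothetical. The sketch expands the degenerate primer into every plain sequence it covers and confirms that each one encodes KPGVPM followed by the first two bases of a glutamic acid codon.

```python
from itertools import product

# IUPAC codes used in the primer; inosine (I) is treated here as pairing
# with any base, which is a simplifying assumption.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "N": "ACGT", "I": "ACGT"}

# Standard genetic code, codons ordered T, C, A, G at each position.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {b1 + b2 + b3: _AMINO[16 * i + 4 * j + k]
          for i, b1 in enumerate(_BASES)
          for j, b2 in enumerate(_BASES)
          for k, b3 in enumerate(_BASES)}

N_TERMINAL = "LAVMSVDLGSESMKVAIVKPGVPMEIVLNKE"  # SEQ ID NO:5
PARTIAL = "KPGVPME"                             # SEQ ID NO:7
PRIMER = "AARCCIGGIGTNCCNATGGA"                 # SEQ ID NO:8, inosine as I


def expand(degenerate: str):
    """Yield every plain A/C/G/T sequence covered by a degenerate primer."""
    for bases in product(*(IUPAC[ch] for ch in degenerate)):
        yield "".join(bases)


def translate(dna: str) -> str:
    """Translate complete codons only; trailing bases are ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, usable, 3))


def primer_encodes(primer: str, peptide: str) -> bool:
    """True if every expansion encodes the peptide, allowing a partial last codon."""
    full = len(primer) // 3
    next_codons = [c for c, aa in CODONS.items() if aa == peptide[full:full + 1]]
    for seq in expand(primer):
        if translate(seq) != peptide[:full]:
            return False
        tail = seq[full * 3:]  # leftover bases of an incomplete final codon
        if tail and not any(c.startswith(tail) for c in next_codons):
            return False
    return True


if __name__ == "__main__":
    print("primer consistent with peptide:", primer_encodes(PRIMER, PARTIAL))
    print("sequences covered by the mixed primer:", sum(1 for _ in expand(PRIMER)))
    print("position of KPGVPME in SEQ ID NO:5 (1-based):", N_TERMINAL.find(PARTIAL) + 1)
```

The same arithmetic accounts for the 39-nucleotide sequence mentioned above: the 13-residue peptide KPGVPMEIVLNKE corresponds to 13 x 3 = 39 nucleotides.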

Using the above insert of the clone as a probe, RNA from cultured rat astrocytes was subjected to Northern
blotting; the results demonstrated that mRNA with a length of about 4 kb was induced by hypoxic treatment. Thereupon,
the above insert of the clone was labeled by the random prime labeling method (Ready To Go, produced by
Pharmacia) using α-[32P]dCTP to yield a probe. Using this probe, 1.2 x 10^4 clones of the cDNA library were screened by colony
hybridization to obtain a clone containing a 2800 bp insert. The nucleotide sequence of this clone insert was
determined by preparing deletion mutants using a kilosequence deletion kit (produced by Takara Shuzo).

Since this clone did not contain the 3'-region of the ORP150 coding sequence, the following two 20-base oligonu- 
cleotides were prepared on the basis of the specific nucleotide sequence near the 3' end of the above insert, to obtain 
the full-length sequence. 

5'-GCACCCTTGAGGAAAATGCT-3' (SEQ ID NO:10)
5'-CCCAGAAGCCCAATGAGAAG-3' (SEQ ID NO:11)
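
Basic properties of the two 20-base oligonucleotides above can be estimated with a minimal sketch. The length and GC content follow directly from the sequences; the melting temperature uses the Wallace rule (2°C per A or T, 4°C per G or C), which is only a rough approximation for short oligonucleotides and is an assumption introduced here, not a value given in this description. The function names are hypothetical.

```python
OLIGOS = {
    "SEQ ID NO:10": "GCACCCTTGAGGAAAATGCT",
    "SEQ ID NO:11": "CCCAGAAGCCCAATGAGAAG",
}


def gc_content(oligo: str) -> float:
    """Percentage of G and C bases in the oligonucleotide."""
    s = oligo.upper()
    return 100.0 * (s.count("G") + s.count("C")) / len(s)


def wallace_tm(oligo: str) -> int:
    """Rough melting temperature in degrees C by the Wallace rule.

    Tm = 2 * (A + T) + 4 * (G + C); intended only for short oligonucleotides.
    """
    s = oligo.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, seq in OLIGOS.items():
        print(f"{name}: {len(seq)} bases, "
              f"GC {gc_content(seq):.0f}%, Wallace Tm about {wallace_tm(seq)} C")
```

Such estimates are useful only as a plausibility check on hybridization and PCR conditions; the conditions actually used are those given in the protocol of the kit referred to in the Examples.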

Using the two oligonucleotides, a clone containing the entire coding region was selected from the rat astrocyte 
cDNA library in accordance with the protocol for the Gene Trapper cDNA Positive Selection System (produced by Life 
Technology), and its nucleotide sequence was determined. 

The nucleotide sequence of rat ORP150 cDNA was thus determined as shown by SEQ ID NO:4 in the sequence
listing. Based on this nucleotide sequence, the amino acid sequence of rat ORP150 was determined as shown by SEQ 
ID NO:3 in the sequence listing. 

Example 5 


Preparation of human U373 cDNA library 

Poly(A)+ RNA was purified from U373 cells (1 x 10^8 cells) in which human ORP150 had been induced under
hypoxic conditions, in the same manner as described in Example 3. Double-stranded cDNA was then synthesized in
accordance with the protocol for the Superscript Choice System (produced by Life Technology) using a 1:1 mixture of
random hexamer primers and an oligo(dT) primer. This cDNA was inserted into the EcoRI site of the pSPORT1 vector
to prepare a cDNA library consisting of 2 x 10^5 independent clones.

Specifically, the library was prepared as follows: Human U373 cells, cultured in 10 plastic petri dishes (150 mm in
diameter) (1 x 10^7 cells/dish), were subjected to hypoxic treatment for 48 hours by the method of Ogawa et al. [Ogawa,
S., Gerlach, H., Esposito, C., Macaulay, A.P., Brett, J., and Stern, D., J. Clin. Invest., 85, 1090-1098 (1990)] as described
in Example 3, after which total RNA was prepared by the acid guanidinium-phenol-chloroform method [Chomczynski,
P. and Sacchi, N., Anal. Biochem., 162, 156-159 (1987)]. Using 500 µg of the total RNA obtained, purification was
conducted twice in accordance with the protocol for poly(A)+ RNA purification using oligo(dT)-magnetic beads (produced
by Perceptive Diagnostics), to yield poly(A)+ RNA. Double-stranded cDNA was then synthesized using 5 µg of the
poly(A)+ RNA and a 1:1 mixture of random hexamer primers and an oligo(dT) primer, in accordance with the protocol
for the Superscript Choice System (produced by Life Technology), and inserted into the EcoRI site of the pSPORT1
vector to prepare a human U373 cDNA library consisting of 2 x 10^5 independent clones.


Example 6

Cloning of human ORP150 cDNA

Using two primers (SEQ ID NO:10 and SEQ ID NO:11) prepared on the basis of the above-described rat ORP150
cDNA-specific sequence, a clone containing the entire coding region was selected from the human U373 cDNA library
in accordance with the protocol for the Gene Trapper cDNA Positive Selection System (produced by Life Technology),
and its nucleotide sequence was determined. The nucleotide sequence of human ORP150 cDNA was thus determined
as shown by SEQ ID NO:2 in the sequence listing.

Specifically, 2 x 10^4 clones of the human U373 cDNA library were amplified in accordance with the protocol for the
Gene Trapper cDNA Positive Selection System (produced by Life Technology). Five micrograms of the plasmid purified
from amplified clones were treated with the Gene II and Exo III nuclease included in the kit to yield single-stranded
DNA. An oligonucleotide (SEQ ID NO:10) prepared on the basis of the above-described rat ORP150 cDNA-specific
sequence was biotinylated and subsequently hybridized to the above single-stranded DNA at 37°C for 1 hour. The
single-stranded DNA hybridized to the oligonucleotide derived from rat ORP150 cDNA was selectively recovered by using
streptavidin-magnetic beads, and was treated with the repair enzyme included in the kit using the oligonucleotide
shown by SEQ ID NO:10 in the sequence listing as a primer, to yield double-stranded plasmid DNA.

The double-stranded plasmid DNA was then introduced into ElectroMax DH10B cells (produced by Life Technology)
in accordance with the protocol for the Gene Trapper cDNA Positive Selection System, followed by colony PCR in
accordance with the same protocol using two primers (SEQ ID NO:10 and SEQ ID NO:11) prepared on the basis of the
rat ORP150 cDNA-specific sequence, to select clones that yield a PCR product of about 550 bp. The nucleotide
sequence of the longest insert among these clones, corresponding to the human ORP150 cDNA, was determined as
shown by SEQ ID NO:2 in the sequence listing.

On the basis of this nucleotide sequence, the amino acid sequence of human ORP150 was determined as shown 
by SEQ ID NO:1 in the sequence listing. 

The N-terminal amino acid sequence (SEQ ID NO:5) obtained with purified rat ORP150 corresponded to amino
acids 33-63 deduced from both the human and rat cDNAs, indicating that the first 32 residues represent the signal
peptide for secretion. The C-terminal KNDEL sequence, which resembles the KDEL sequence, a signal for retaining
ER-resident proteins [Pelham, H.R.B., Trends Biochem. Sci. 15, 483-486 (1990)], may function as an ER-retention signal. The
existence of a signal peptide at the N-terminus and the ER-retention signal-like sequence at the C-terminus suggests
that ORP150 resides in the ER, consistent with the results of immunocytochemical analysis reported by Kuwabara et
al. [Kuwabara, K., Matsumoto, M., Ikeda, J., Hori, O., Ogawa, S., Maeda, Y., Kitagawa, K., Imuta, N., Kinoshita, T.,
Stern, D.M., Yanagi, H., and Kamada, T., J. Biol. Chem. 271, 5025-5032 (1996)].
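
The positional reasoning in the preceding paragraph can be reproduced with a short script. The following is a minimal sketch in Python under stated assumptions: the function name `locate` and the regular expression `KDEL_LIKE` are illustrative (the pattern is a loose hypothetical test for a KDEL-like C-terminus, not a published rule), and the sequences are one-letter transcriptions of residues 1-64 and 989-999 of SEQ ID NO:1 and of SEQ ID NO:5.

```python
import re

# One-letter transcriptions from the sequence listing.
HUMAN_1_64 = (
    "MADKVRRQRPRRRVCW"   # residues 1-16 of SEQ ID NO:1
    "ALVAVLLADLLALSDT"   # residues 17-32
    "LAVMSVDLGSESMKVA"   # residues 33-48
    "IVKPGVPMEIVLNKES"   # residues 49-64
)
RAT_N_TERMINAL = "LAVMSVDLGSESMKVAIVKPGVPMEIVLNKE"  # SEQ ID NO:5
HUMAN_C_TERMINUS = "GQKRPLKNDEL"                     # residues 989-999

# Assumption: "KDEL-like" means KDEL with at most one inserted residue after K.
KDEL_LIKE = re.compile(r"K[A-Z]?DEL$")


def locate(peptide: str, protein: str):
    """Return the 1-based (start, end) of `peptide` in `protein`, or None."""
    index = protein.find(peptide)
    if index < 0:
        return None
    return index + 1, index + len(peptide)


if __name__ == "__main__":
    start, end = locate(RAT_N_TERMINAL, HUMAN_1_64)
    print(f"mature N-terminus maps to residues {start}-{end}")
    print(f"residues cleaved as signal peptide: {start - 1}")
    print("KDEL-like C-terminus:", bool(KDEL_LIKE.search(HUMAN_C_TERMINUS)))
```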

Analysis of protein data bases with the BLAST program [Altschul, S.F., Gish, W., Miller, W., Myers, E.W., and Lipman,
D.J., J. Mol. Biol. 215, 403-410 (1990)] showed that the N-terminal half of ORP150 has a modest similarity to the
ATPase domain of numerous HSP70 family sequences. An extensive analysis with pairwise alignments [Pearson, W.R.,
and Lipman, D.J., Proc. Natl. Acad. Sci. USA 85, 2444-2448 (1988)] revealed that amino acids 33-426 of human
ORP150 were 32% identical to amino acids 1-380 of both inducible human HSP70.1 [Hunt, C., and Morimoto, R.I., Proc.
Natl. Acad. Sci. USA 82, 6455-6459 (1985)] and constitutive bovine HSC70 [DeLuca-Flaherty, C., and McKay, D.B.,
Nucleic Acids Res. 18, 5569 (1990)], typical members of the HSP70 family. An additional region similar to HSP70RY and
hamster HSP110, which both belong to a new subfamily of large HSP70-like proteins [Lee-Yoon, D., Easton, D.,
Murawski, M., Burd, R., and Subjeck, J.R., J. Biol. Chem. 270, 15725-15733 (1995)], extended further to residue 487. A
protein sequence motif search with PROSITE [Bairoch, A., and Bucher, P., Nucleic Acids Res. 22, 3583-3589 (1994)]
showed that ORP150 contains two of the three HSP70 protein family signatures: FYDMGSGSTVCTIV (amino acids
230-243, SEQ ID NO:1) and VILVGGATRVPRVQE (amino acids 380-394, SEQ ID NO:1), which completely matched
HSP70 signatures 2 and 3, respectively, and VDLG (amino acids 38-41, SEQ ID NO:1), which matched the
first four amino acids of signature 1. Furthermore, the N-terminal region of ORP150 contained a putative ATP-binding
site consisting of the regions (amino acids 36-53, 197-214, 229-243, 378-400, and 411-425, SEQ ID NO:1)
corresponding to the five motifs specified by Bork et al. [Bork, P., Sander, C., and Valencia, A., Proc. Natl. Acad. Sci. USA
89, 7290-7294 (1992)]. Although the C-terminal putative peptide-binding domains of the HSP70 family are generally less
conserved [Rippmann, F., Taylor, W.R., Rothbard, J.B., and Green, N.M., EMBO J. 10, 1053-1059 (1991)], the
C-terminal region flanked by amino acids 701 and 898 (SEQ ID NO:1) shared appreciable similarity with HSP110 (amino acids
595-793; 29% identity).
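
The residue numbers quoted for the two fully matching HSP70 signatures can be re-derived from SEQ ID NO:1. The following is a minimal sketch in Python, assuming one-letter transcriptions of residues 225-243 and 369-400 of SEQ ID NO:1; the function name `motif_span` and the constant names are illustrative, and the sketch performs a plain substring search rather than a PROSITE pattern search.

```python
# One-letter transcriptions of two stretches of SEQ ID NO:1 (human ORP150).
FRAGMENT_225 = ("AQNIMFYDMGSGSTVCTIV", 225)                # residues 225-243
FRAGMENT_369 = ("SAEMSLDEIEQVILVGGATRVPRVQEVLLKAV", 369)   # residues 369-400

SIGNATURE_2_MATCH = "FYDMGSGSTVCTIV"
SIGNATURE_3_MATCH = "VILVGGATRVPRVQE"


def motif_span(motif: str, fragment: str, first_residue: int):
    """Return the 1-based (start, end) residue numbers of `motif`, or None.

    `first_residue` is the residue number of fragment[0] in the full protein.
    """
    index = fragment.find(motif)
    if index < 0:
        return None
    start = first_residue + index
    return start, start + len(motif) - 1


if __name__ == "__main__":
    print("signature 2 match:", motif_span(SIGNATURE_2_MATCH, *FRAGMENT_225))
    print("signature 3 match:", motif_span(SIGNATURE_3_MATCH, *FRAGMENT_369))
```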

Example 7

Cloning of human ORP150 genomic DNA

A human genomic library purchased from Clontech (derived from human placenta, Cat. #HL1067J, Lot #1221, 2.5
x 10^6 independent clones) was used. A DNA fragment consisting of 202 bp of the 5' untranslated region and 369 bp of
the coding region derived from the rat cDNA clone, as well as a 1351 bp DNA fragment containing the termination
codon, derived from the human cDNA, were used as probes for plaque hybridization.

Escherichia coli LE392, previously infected with 1 x 10^6 pfu of the human genomic library, was plated onto 10 petri
dishes 15 cm in diameter to allow plaque formation. The phage DNA was transferred to a nylon membrane (Hybond-N+,
Amersham) and denatured with sodium hydroxide, after which it was fixed by ultraviolet irradiation. The rat cDNA
probe was labeled using a DNA labeling kit (Ready To Go, Pharmacia), and hybridized with the membrane in the
Rapid-hyb buffer (Amersham). After incubation at 65°C for 2 hours, the nylon membrane was washed with 0.2 x SSC-0.1%
SDS, and a positive clone was detected on an imaging plate (Fuji Photo Film). Since the clone isolated contained only
exons 1 through 24, 1.5 x 10^6 clones of the same library were screened again using the human cDNA probe in the same
manner, resulting in isolation of one clone. This clone was found to contain exons 16 through 26, with an overlap with
the 3' region of the above-mentioned clone. The entire region of the ORP150 gene was thus cloned by combining these
two clones.

These two clones were cleaved with BamHI and subcloned into pBluescript II SK (Stratagene), followed by
nucleotide sequence determination of the entire 15851 bp human ORP150 genomic DNA. The nucleotide sequence from the
5' end to just before the translation initiation codon ATG in exon 2 is shown by SEQ ID NO:12 in the sequence listing.

Furthermore, the nucleotide sequence of the 15851 bp human ORP150 genomic DNA was compared with that of
the human ORP150 cDNA shown by SEQ ID NO:2 in the sequence listing, which demonstrated the presence
of the exons at the positions shown below. A schematic diagram of the positions of the exons is shown in Figure 1.

           Position in genomic DNA    (Base position in SEQ ID NO:2)

Exon 1      1908 -  2002              (   1 -   95)
Exon 2      2855 -  2952              (  96 -  193)
Exon 3      3179 -  3272              ( 194 -  287)
Exon 4      3451 -  3529              ( 288 -  366)
Exon 5      3683 -  3837              ( 367 -  521)
Exon 6      3962 -  4038              ( 522 -  598)
Exon 7      4347 -  4528              ( 599 -  780)
Exon 8      4786 -  4901              ( 781 -  896)
Exon 9      6193 -  6385              ( 897 - 1089)
Exon 10     6593 -  6727              (1090 - 1224)
Exon 11     6850 -  6932              (1225 - 1307)
Exon 12     7071 -  7203              (1308 - 1440)
Exon 13     7397 -  7584              (1441 - 1628)
Exon 14     7849 -  7987              (1629 - 1767)
Exon 15     9176 -  9236              (1768 - 1828)
Exon 16     9378 -  9457              (1829 - 1908)
Exon 17     9810 -  9995              (1909 - 2094)
Exon 18    10127 - 10299              (2095 - 2267)
Exon 19    10450 - 10537              (2268 - 2355)
Exon 20    10643 - 10765              (2356 - 2478)
Exon 21    10933 - 11066              (2479 - 2612)
Exon 22    11195 - 11279              (2613 - 2697)
Exon 23    12211 - 12451              (2698 - 2938)
Exon 24    12546 - 12596              (2939 - 2989)
Exon 25    13181 - 13231              (2990 - 3040)
Exon 26    13358 - 14823              (3041 - 4503)

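The exon coordinates tabulated above lend themselves to a simple consistency check. The following is a minimal sketch in Python; the function names are illustrative assumptions and the tuples are transcribed from the table. It computes exon and intron lengths and lists any exon whose genomic span and cDNA span differ in length. As transcribed here, exon 26 is the only such row (1466 bp against 1463 bp); the sketch merely flags the difference and does not decide which figure is correct.

```python
# (exon, genomic start, genomic end, cDNA start, cDNA end), from the table.
EXONS = [
    (1, 1908, 2002, 1, 95), (2, 2855, 2952, 96, 193),
    (3, 3179, 3272, 194, 287), (4, 3451, 3529, 288, 366),
    (5, 3683, 3837, 367, 521), (6, 3962, 4038, 522, 598),
    (7, 4347, 4528, 599, 780), (8, 4786, 4901, 781, 896),
    (9, 6193, 6385, 897, 1089), (10, 6593, 6727, 1090, 1224),
    (11, 6850, 6932, 1225, 1307), (12, 7071, 7203, 1308, 1440),
    (13, 7397, 7584, 1441, 1628), (14, 7849, 7987, 1629, 1767),
    (15, 9176, 9236, 1768, 1828), (16, 9378, 9457, 1829, 1908),
    (17, 9810, 9995, 1909, 2094), (18, 10127, 10299, 2095, 2267),
    (19, 10450, 10537, 2268, 2355), (20, 10643, 10765, 2356, 2478),
    (21, 10933, 11066, 2479, 2612), (22, 11195, 11279, 2613, 2697),
    (23, 12211, 12451, 2698, 2938), (24, 12546, 12596, 2939, 2989),
    (25, 13181, 13231, 2990, 3040), (26, 13358, 14823, 3041, 4503),
]


def span_length(start: int, end: int) -> int:
    """Length of an inclusive, 1-based interval."""
    return end - start + 1


def length_mismatches(exons):
    """Exon numbers whose genomic and cDNA spans differ in length."""
    return [n for n, gs, ge, cs, ce in exons
            if span_length(gs, ge) != span_length(cs, ce)]


def intron_lengths(exons):
    """Lengths of the introns between consecutive exons (genomic coordinates)."""
    return [nxt[1] - cur[2] - 1 for cur, nxt in zip(exons, exons[1:])]


def cdna_is_contiguous(exons) -> bool:
    """True if each exon's cDNA span starts right after the previous one ends."""
    return all(nxt[3] == cur[4] + 1 for cur, nxt in zip(exons, exons[1:]))


if __name__ == "__main__":
    print("rows with differing span lengths:", length_mismatches(EXONS))
    print("intron 1 length:", intron_lengths(EXONS)[0])
    print("cDNA spans contiguous:", cdna_is_contiguous(EXONS))
```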
Example 8 

Northern blot analysis 

A 4.5-kb EcoRI fragment of human ORP150 cDNA was labeled with [α-32P]dCTP (3,000 Ci/mmol; Amersham
Corp., Arlington Heights, IL) by using a DNA labeling kit (Pharmacia), and used as a hybridization probe. Twenty
micrograms of total RNA prepared from U373 cells exposed to various stresses was electrophoresed and transferred onto a
Hybond-N+ membrane (Amersham Corp.). Multiple Tissue Northern Blots, in which each lane contained 2 µg of poly(A) RNA
from the adult human tissues indicated, were purchased from Clontech. The filters were hybridized at 65°C in the Rapid-hyb
buffer (Amersham Corp.) with human ORP150, GRP78, HSP70, glyceraldehyde-3-phosphate dehydrogenase
(G3PDH), and β-actin cDNAs each labeled with [α-32P]dCTP, washed with 0.1 x SSC containing 0.1% SDS at 65°C,
and subjected to autoradiography.

As shown in Figure 2, the ORP150 mRNA level was markedly enhanced upon 24 - 48 hours of exposure to hypoxia.
In parallel experiments, treatment with 2-deoxyglucose (25 mM, 24 hours) or tunicamycin (5 µg/ml, 24 hours) enhanced
ORP150 mRNA to levels comparable to that induced by hypoxia. The induction levels were also comparable with
those observed for the mRNA of a typical glucose-regulated protein, GRP78. Heat shock treatment failed to enhance
ORP150 mRNA appreciably.

ORP150 mRNA was found to be highly expressed in the liver and pancreas, whereas little expression was
observed in the kidney and brain (Figure 3). Furthermore, the tissue specificity of ORP150 expression was quite similar to
that of GRP78. The higher expression observed in the tissues that contain well-developed ER and synthesize large
amounts of secretory proteins is consistent with the finding that ORP150 is localized in the ER (Kuwabara, K.,
Matsumoto, M., Ikeda, J., Hori, O., Ogawa, S., Maeda, Y., Kitagawa, K., Imuta, N., Kinoshita, T., Stern, D.M., Yanagi, H., and
Kamada, T., J. Biol. Chem. 271, 5025-5032 (1996)).

In conclusion, both the characteristic primary protein structure and the similarity found with GRP78 in stress
inducibility and tissue specificity suggest that ORP150 plays an important role in protein folding and secretion in the ER,
perhaps as a molecular chaperone, in concert with other GRPs to cope with environmental stress.

Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many
equivalents to the specific embodiments of the present invention described specifically herein. Such equivalents are
intended to be encompassed in the scope of the following claims.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(iii) NUMBER OF SEQUENCES: 12

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 999 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

Met Ala Asp Lys Val Arg Arg Gln Arg Pro Arg Arg Arg Val Cys Trp
5 10 15

Ala Leu Val Ala Val Leu Leu Ala Asp Leu Leu Ala Leu Ser Asp Thr
20 25 30

Leu Ala Val Met Ser Val Asp Leu Gly Ser Glu Ser Met Lys Val Ala
35 40 45

Ile Val Lys Pro Gly Val Pro Met Glu Ile Val Leu Asn Lys Glu Ser
50 55 60

Arg Arg Lys Thr Pro Val Ile Val Thr Leu Lys Glu Asn Glu Arg Phe
65 70 75 80

Phe Gly Asp Ser Ala Ala Ser Met Ala Ile Lys Asn Pro Lys Ala Thr
85 90 95

Leu Arg Tyr Phe Gln His Leu Leu Gly Lys Gln Ala Asp Asn Pro His
100 105 110

Val Ala Leu Tyr Gln Ala Arg Phe Pro Glu His Glu Leu Thr Phe Asp
115 120 125

Pro Gln Arg Gln Thr Val His Phe Gln Ile Ser Ser Gln Leu Gln Phe
130 135 140

Ser Pro Glu Glu Val Leu Gly Met Val Leu Asn Tyr Ser Arg Ser Leu
145 150 155 160

Ala Glu Asp Phe Ala Glu Gln Pro Ile Lys Asp Ala Val Ile Thr Val
165 170 175

Pro Val Phe Phe Asn Gln Ala Glu Arg Arg Ala Val Leu Gln Ala Ala
180 185 190

Arg Met Ala Gly Leu Lys Val Leu Gln Leu Ile Asn Asp Asn Thr Ala
195 200 205

Thr Ala Leu Ser Tyr Gly Val Phe Arg Arg Lys Asp Ile Asn Thr Thr
210 215 220

Ala Gln Asn Ile Met Phe Tyr Asp Met Gly Ser Gly Ser Thr Val Cys
225 230 235 240

Thr Ile Val Thr Tyr Gln Met Val Lys Thr Lys Glu Ala Gly Met Gln
245 250 255

Pro Gln Leu Gln Ile Arg Gly Val Gly Phe Asp Arg Thr Leu Gly Gly
260 265 270

Leu Glu Met Glu Leu Arg Leu Arg Glu Arg Leu Ala Gly Leu Phe Asn
275 280 285

Glu Gln Arg Lys Gly Gln Arg Ala Lys Asp Val Arg Glu Asn Pro Arg
290 295 300

Ala Met Ala Lys Leu Leu Arg Glu Ala Asn Arg Leu Lys Thr Val Leu
305 310 315 320



Ser Ala Asn Ala Asp His Met Ala Gln Ile Glu Gly Leu Met Asp Asp
325 330 335

Val Asp Phe Lys Ala Lys Val Thr Arg Val Glu Phe Glu Glu Leu Cys
340 345 350

Ala Asp Leu Phe Glu Arg Val Pro Gly Pro Val Gln Gln Ala Leu Gln
355 360 365

Ser Ala Glu Met Ser Leu Asp Glu Ile Glu Gln Val Ile Leu Val Gly
370 375 380

Gly Ala Thr Arg Val Pro Arg Val Gln Glu Val Leu Leu Lys Ala Val
385 390 395 400

Gly Lys Glu Glu Leu Gly Lys Asn Ile Asn Ala Asp Glu Ala Ala Ala
405 410 415

Met Gly Ala Val Tyr Gln Ala Ala Ala Leu Ser Lys Ala Phe Lys Val
420 425 430

Lys Pro Phe Val Val Arg Asp Ala Val Val Tyr Pro Ile Leu Val Glu
435 440 445

Phe Thr Arg Glu Val Glu Glu Glu Pro Gly Ile His Ser Leu Lys His
450 455 460

Asn Lys Arg Val Leu Phe Ser Arg Met Gly Pro Tyr Pro Gln Arg Lys
465 470 475 480

Val Ile Thr Phe Asn Arg Tyr Ser His Asp Phe Asn Phe His Ile Asn
485 490 495

Tyr Gly Asp Leu Gly Phe Leu Gly Pro Glu Asp Leu Arg Val Phe Gly
500 505 510

Ser Gln Asn Leu Thr Thr Val Lys Leu Lys Gly Val Gly Asp Ser Phe
515 520 525

Lys Lys Tyr Pro Asp Tyr Glu Ser Lys Gly Ile Lys Ala His Phe Asn
530 535 540

Leu Asp Glu Ser Gly Val Leu Ser Leu Asp Arg Val Glu Ser Val Phe
545 550 555 560

Glu Thr Leu Val Glu Asp Ser Ala Glu Glu Glu Ser Thr Leu Thr Lys
565 570 575

Leu Gly Asn Thr Ile Ser Ser Leu Phe Gly Gly Gly Thr Thr Pro Asp
580 585 590

Ala Lys Glu Asn Gly Thr Asp Thr Val Gln Glu Glu Glu Glu Ser Pro
595 600 605

Ala Glu Gly Ser Lys Asp Glu Pro Gly Glu Gln Val Glu Leu Lys Glu
610 615 620

Glu Ala Glu Ala Pro Val Glu Asp Gly Ser Gln Pro Pro Pro Pro Glu
625 630 635 640

Pro Lys Gly Asp Ala Thr Pro Glu Gly Glu Lys Ala Thr Glu Lys Glu
645 650 655

Asn Gly Asp Lys Ser Glu Ala Gln Lys Pro Ser Glu Lys Ala Glu Ala
660 665 670

Gly Pro Glu Gly Val Ala Pro Ala Pro Glu Gly Glu Lys Lys Gln Lys
675 680 685

Pro Ala Arg Lys Arg Arg Met Val Glu Glu Ile Gly Val Glu Leu Val
690 695 700

Val Leu Asp Leu Pro Asp Leu Pro Glu Asp Lys Leu Ala Gln Ser Val
705 710 715 720

Gln Lys Leu Gln Asp Leu Thr Leu Arg Asp Leu Glu Lys Gln Glu Arg
725 730 735

Glu Lys Ala Ala Asn Ser Leu Glu Ala Phe Ile Phe Glu Thr Gln Asp
740 745 750

Lys Leu Tyr Gln Pro Glu Tyr Gln Glu Val Ser Thr Glu Glu Gln Arg
755 760 765


Glu Glu Ile Ser Gly Lys Leu Ser Ala Ala Ser Thr Trp Leu Glu Asp
770 775 780

Glu Gly Val Gly Ala Thr Thr Val Met Leu Lys Glu Lys Leu Ala Glu
785 790 795 800

Leu Arg Lys Leu Cys Gln Gly Leu Phe Phe Arg Val Glu Glu Arg Lys
805 810 815

Lys Trp Pro Glu Arg Leu Ser Ala Leu Asp Asn Leu Leu Asn His Ser
820 825 830

Ser Met Phe Leu Lys Gly Ala Arg Leu Ile Pro Glu Met Asp Gln Ile
835 840 845

Phe Thr Glu Val Glu Met Thr Thr Leu Glu Lys Val Ile Asn Glu Thr
850 855 860

Trp Ala Trp Lys Asn Ala Thr Leu Ala Glu Gln Ala Lys Leu Pro Ala
865 870 875 880

Thr Glu Lys Pro Val Leu Leu Ser Lys Asp Ile Glu Ala Lys Met Met
885 890 895

Ala Leu Asp Arg Glu Val Gln Tyr Leu Leu Asn Lys Ala Lys Phe Thr
900 905 910

Lys Pro Arg Pro Arg Pro Lys Asp Lys Asn Gly Thr Arg Ala Glu Pro
915 920 925

Pro Leu Asn Ala Ser Ala Ser Asp Gln Gly Glu Lys Val Ile Pro Pro
930 935 940

Ala Gly Gln Thr Glu Asp Ala Glu Pro Ile Ser Glu Pro Glu Lys Val
945 950 955 960

Glu Thr Gly Ser Glu Pro Gly Asp Thr Glu Pro Leu Glu Leu Gly Gly
965 970 975

Pro Gly Ala Glu Pro Glu Gln Lys Glu Gln Ser Thr Gly Gln Lys Arg
980 985 990

Pro Leu Lys Asn Asp Glu Leu
995

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4503 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY: CDS

(B) IDENTIFICATION METHOD: E

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

TTGTGAAGGG CGCGGGTGGG GGGCGCTGCC GGCCTCGTGG GTACGTTCGT GCCGCGTCTG 60
TCCCAGAGCT GGGGCCGCAG GAGCGGAGGC AAGAGGGGCA CTATGGCAGA CAAAGTTAGG 120
AGGCAGAGGC CGAGGAGGCG AGTCTGTTGG GCCTTGGTGG CTGTGCTCTT GGCAGACCTG 180
TTGGCACTGA GTGATACACT GGCAGTGATG TCTGTGGACC TGGGCAGTGA GTCCATGAAG 240
GTGGCCATTG TCAAACCTGG AGTGCCCATG GAAATTGTCT TGAATAAGGA ATCTCGGAGG 300
AAAACACCGG TGATCGTGAC CCTGAAAGAA AATGAAAGAT TCTTTGGAGA CAGTGCAGCA 360
AGCATGGCGA TTAAGAATCC AAAGGCTACG CTACGTTACT TCCAGCACCT CCTGGGGAAG 420
CAGGCAGATA ACCCCCATGT AGCTCTTTAC CAGGCCCGCT TCCCGGAGCA CGAGCTGACT 480
TTCGACCCAC AGAGGCAGAC TGTGCACTTT CAGATCAGCT CGCAGCTGCA GTTCTCACCT 540
GAGGAAGTGT TGGGCATGGT TCTCAATTAT TCTCGTTCTC TAGCTGAAGA TTTTGCAGAG 600
CAGCCCATCA AGGATGCAGT GATCACCGTG CCAGTCTTCT TCAACCAGGC CGAGCGCCGA 660
GCTGTGCTGC AGGCTGCTCG TATGGCTGGC CTCAAAGTGC TGCAGCTCAT CAATGACAAC 720
ACCGCCACTG CCCTCAGCTA TGGTGTCTTC CGCCGGAAAG ATATTAACAC CACTGCCCAG 780
AATATCATGT TCTATGACAT GGGCTCAGGC AGCACCGTAT GCACCATTGT GACCTACCAG 840
ATGGTGAAGA CTAAGGAAGC TGGGATGCAG CCACAGCTGC AGATCCGGGG AGTAGGATTT 900
GACCGTACCC TGGGGGGCCT GGAGATGGAG CTCCGGCTTC GAGAACGCCT GGCTGGGCTT 960
TTCAATGAGC AGCGCAAGGG TCAGAGAGCA AAGGATGTGC GGGAGAACCC GCGTGCCATG 1020
GCCAAGCTGC TGCGTGAGGC TAATCGGCTC AAAACCGTCC TCAGTGCCAA CGCTGACCAC 1080
ATGGCACAGA TTGAAGGCCT GATGGATGAT GTGGACTTCA AGGCAAAAGT GACTCGTGTG 1140
GAATTTGAGG AGTTGTGTGC AGACTTGTTT GAGCGGGTGC CTGGGCCTGT ACAGCAGGCC 1200
CTCCAGAGTG CCGAAATGAG TCTGGATGAG ATTGAGCAGG TGATCCTGGT GGGTGGGGCC 1260
ACTCGGGTCC CCAGAGTTCA GGAGGTGCTG CTGAAGGCCG TGGGCAAGGA GGAGCTGGGG 1320
AAGAACATCA ATGCAGATGA AGCAGCCGCC ATGGGGGCAG TGTACCAGGC AGCTGCGCTC 1380
AGCAAAGCCT TTAAAGTGAA GCCATTTGTC GTCCGAGATG CAGTGGTCTA CCCCATCCTG 1440
GTGGAGTTCA CGAGGGAGGT GGAGGAGGAG CCTGGGATTC ACAGCCTGAA GCACAATAAA 1500
CGGGTACTCT TCTCTCGGAT GGGGCCCTAC CCTCAACGCA AAGTCATCAC CTTTAACCGC 1560
TACAGCCATG ATTTCAACTT CCACATCAAC TACGGCGACC TGGGCTTCCT GGGGCCTGAA 1620
GATCTTCGGG TATTTGGCTC CCAGAATCTG ACCACAGTGA AGCTAAAAGG GGTGGGTGAC 1680
AGCTTCAAGA AGTATCCTGA CTACGAGTCC AAGGGCATCA AGGCTCACTT CAACCTGGAT 1740
GAGAGTGGCG TGCTCAGTCT AGACAGGGTG GAGTCTGTAT TTGAGACACT GGTAGAGGAC 1800
AGCGCAGAAG AGGAATCTAC TCTCACCAAA CTTGGCAACA CCATTTCCAG CCTGTTTGGA 1860
GGCGGTACCA CACCAGATGC CAAGGAGAAT GGTACTGATA CTGTCCAGGA GGAAGAGGAG 1920

AGCCCTGCAG AGGGGAGCAA GGACGAGCCT GGGGAGCAGG TGGAGCTCAA GGAGGAAGCT 1980
GAGGCCCCAG TGGAGGATGG CTCTCAGCCC CCACCCCCTG AACCTAAGGG AGATGCAACC 2040
CCTGAGGGAG AAAAGGCCAC AGAAAAAGAA AATGGGGACA AGTCTGAGGC CCAGAAACCA 2100
AGTGAGAAGG CAGAGGCAGG GCCTGAGGGC GTCGCTCCAG CCCCAGAGGG AGAGAAGAAG 2160
CAGAAGCCCG CCAGGAAGCG GCGAATGGTA GAGGAGATCG GGGTGGAGCT GGTTGTTCTG 2220
GACCTGCCTG ACTTGCCAGA GGATAAGCTG GCTCAGTCGG TGCAGAAACT TCAGGACTTG 2280
ACACTCCGAG ACCTGGAGAA GCAGGAACGG GAAAAAGCTG CCAACAGCTT GGAAGCGTTC 2340 
ATATTTGAGA CCCAGGACAA GCTGTACCAG CCCGAGTACC AGGAAGTGTC CACAGAGGAG 2400 
CAGCGTGAGG AGATCTCTGG GAAGCTCAGC GCCGCATCCA CCTGGCTGGA GGATGAGGGT 2460 
GTTGGAGCCA CCACAGTGAT GTTGAAGGAG AAGCTGGCTG AGCTGAGGAA GCTGTGCCAA 2520 
GGGCTGTTTT TTCGGGTAGA GGAGCGCAAG AAGTGGCCCG AACGGCTGTC TGCCCTCGAT 2580 
AATCTCCTCA ACCATTCCAG CATGTTCCTC AAGGGGGCCC GGCTCATCCC AGAGATGGAC 2640 
CAGATCTTCA CTGAGGTGGA GATGACAACG TTAGAGAAAG TCATCAATGA GACCTGGGCC 2700 
TGGAAGAATG CAACTCTGGC CGAGCAGGCT AAGCTGCCCG CCACAGAGAA GCCTGTGTTG 2760 
CTCTCAAAAG ACATTGAAGC TAAGATGATG GCCCTGGACC GAGAGGTGCA GTATCTGCTC 2820 
AATAAGGCCA AGTTTACCAA GCCCCGGCCC CGGCCTAAGG ACAAGAATGG GACCCGGGCA 2880 
GAGCCACCCC TCAATGCCAG TGCCAGTGAC CAGGGGGAGA AGGTCATCCC TCCAGCAGGC 2940 
CAGACTGAAG ATGCAGAGCC CATTTCAGAA CCTGAGAAAG TAGAGACTGG ATCCGAGCCA 3000 
GGAGACACTG AGCCTTTGGA GTTAGGAGGT CCTGGAGCAG AACCTGAACA GAAAGAACAA 3060 
TCGACAGGAC AGAAGCGGCC TTTGAAGAAC GACGAACTAT AACCCCCACC TCTGTTTTCC 3120 
CCATTCATCT CCACCCCCTT CCCCCACCAC TTCTATTTAT TTAACATCGA GGGTTGGGGG 3180 
AGGGGTTGGT CCTGCCCTCG GCTGGAGTTC CTTTCTCACC CCTGTGATTT GGAGGTGTGG 3240 
AGAAGGGGAA GGGAGGGACA GCTCACTGGT TCCTTCTGCA GTACCTCTGT GGTTAAAAAT 3300 
GGAAACTGTT CTCCTCCCCA GCCCCACTCC CTGTTCCCTA CCCATATAGG CCCTAAATTT 3360 
GGGAAAAATC ACTATTAATT TCTGAATCCT TTGCCTGTGG GTAGGAAGAG AATGGCTGCC 3420 
AGTGGCTGAT GGGTCCCGGT GATGGGAAGG GTATCAGGTT GCTGGGGAGT TTCCACTCTT 3480 
CTCTGGTGAT TGTTCCTTCC CTCCCTTCCT CTCCCACCAT GCGATGAGCA TCCTTTCAGG 3540 
CCAGTGTCTG CAGAGCCTCA GTTACCAGGT TTGGTTTCTG AGTGCCTATC TGTGCTCTTT 3600
CCTCCCTCTG CGGGCTTCTC TTGCTCTGAG CCTCCCTTCC CCATTCCCAT GCAGCTCCTT 3660
TCCCCCTGGG TTTCCTTGGC TTCCTGCAGC AAATTGGGCA GTTCTCTGCC CCTTGCCTAA 3720 
AAGCCTGTAC CTCTGGATTG GCGGAAGTAA ATCTGGAAGG ATTCTCACTC GTATTTCGCA 3780 
CCCCTAGTGG CCAGAGGAGG GAGGGGCACA GTGAAGAAGG GAGCCCACCA CCTCTCCGAA 3840 
GAGGAAAGCC ACGTAGAGTG GTTGGCATGG GGTGCCAGCA TCGTGCAAGC TCTGTCATAA 3900 
TCTGCATCTT CCCAGCAGCC TGGTACCCCA GGTTCCTGTA ACTCCCTGCC TCCTCCTCTC 3960 
TTCTGCTGTT CTGCTCCTCC CAGACAGAGC CTTTCCCTCA CCCCCTGACC CCCTGGGCTG 4020 
ACCAAAATGT GCTTTCTACT GTGAGTCCCT ATCCCAAGAT CCTGGGGAAA GGAGAGACCA 4080 
TGGTGTGAAT GTAGAGATGC CACCTCCCTC TCTCTGAGGC AGGCCTGTGG ATGAAGGAGG 4140 
AGGGTCAGGG CTGGCCTTCC TCTGTGCATC ACTCTGCTAG GTTGGGGGCC CCCGACCCAC 4200 
CATACCTACG CCTAGGGAGC CCGTCCTCCA GTATTCCGTC TGTAGCAGGA GCTAGGGCTG 4260 
CTGCCTCAGC TCCAAGACAA GAATGAACCT GGCTGTTGCA GTGATTTTGT CTTTTCCTTT 4320 
TTTTTTTTTT GCCACATTGG CAGAGATGGG ACCTAAGGGT CCCACCCCTC ACCCCACCCC 4380 
CACCTCTTCT GTATGTTTGA ATTCTTTCAG TAGCTGTTGA TGCTGGTTGG ACAGGTTTGA 4440 
GTCAAATTGT ACTTTGCTCC ATTGTTAATT GAGAAACTGT TTCAATAAAA TATTCTTTTC 4500 
TAC 4503 

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 999 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

Met Ala Ala Thr Val Arg Arg Gln Arg Pro Arg Arg Leu Leu Cys Trp
5 10 15

Ala Leu Val Ala Val Leu Leu Ala Asp Leu Leu Ala Leu Ser Asp Thr
20 25 30

Leu Ala Val Met Ser Val Asp Leu Gly Ser Glu Ser Met Lys Val Ala
35 40 45

Ile Val Lys Pro Gly Val Pro Met Glu Ile Val Leu Asn Lys Glu Ser
50 55 60

Arg Arg Lys Thr Pro Val Thr Val Thr Leu Lys Glu Asn Glu Arg Phe
65 70 75 80

Leu Gly Asp Ser Ala Ala Gly Met Ala Ile Lys Asn Pro Lys Ala Thr
85 90 95

Leu Arg Tyr Phe Gln His Leu Leu Gly Lys Gln Ala Asp Asn Pro His
100 105 110

Val Ala Leu Tyr Arg Ser Arg Phe Pro Glu His Glu Leu Asn Val Asp
115 120 125

Pro Gln Arg Gln Thr Val Arg Phe Gln Ile Ser Pro Gln Leu Gln Phe
130 135 140

Ser Pro Glu Glu Val Leu Gly Met Val Leu Asn Tyr Ser Arg Ser Leu
145 150 155 160

Ala Glu Asp Phe Ala Glu Gln Pro Ile Lys Asp Ala Val Ile Thr Val
165 170 175

Pro Ala Phe Phe Asn Gln Ala Glu Arg Arg Ala Val Leu Gln Ala Ala
180 185 190

Arg Met Ala Gly Leu Lys Val Leu Gln Leu Ile Asn Asp Asn Thr Ala
195 200 205

Thr Ala Leu Ser Tyr Gly Val Phe Arg Arg Lys Asp Ile Asn Ser Thr
210 215 220

Ala Gln Asn Ile Met Phe Tyr Asp Met Gly Ser Gly Ser Thr Val Cys
225 230 235 240

Thr Ile Val Thr Tyr Gln Thr Val Lys Thr Lys Glu Ala Gly Thr Gln
245 250 255

Pro Gln Leu Gln Ile Arg Gly Val Gly Phe Asp Arg Thr Leu Gly Gly
260 265 270

Leu Glu Met Glu Leu Arg Leu Arg Glu His Leu Ala Lys Leu Phe Asn
275 280 285

Glu Gln Arg Lys Gly Gln Lys Ala Lys Asp Val Arg Glu Asn Pro Arg
290 295 300

Ala Met Ala Lys Leu Leu Arg Glu Ala Asn Arg Leu Lys Thr Val Leu
305 310 315 320

Ser Ala Asn Ala Asp His Met Ala Gln Ile Glu Gly Leu Met Asp Asp
325 330 335

Val Asp Phe Lys Ala Lys Val Thr Arg Val Glu Phe Glu Glu Leu Cys
340 345 350

Ala Asp Leu Phe Asp Arg Val Pro Gly Pro Val Gln Gln Ala Leu Gln
355 360 365

Ser Ala Glu Met Ser Leu Asp Gln Ile Glu Gln Val Ile Leu Val Gly
370 375 380

Gly Pro Thr Arg Val Pro Lys Val Gln Glu Val Leu Leu Lys Pro Val
385 390 395 400

Gly Lys Glu Glu Leu Gly Lys Asn Ile Asn Ala Asp Glu Ala Ala Ala
405 410 415

Met Gly Ala Val Tyr Gln Ala Ala Ala Leu Ser Lys Ala Phe Lys Val
420 425 430

Lys Pro Phe Val Val Arg Asp Ala Val Ile Tyr Pro Ile Leu Val Glu
435 440 445

Phe Thr Arg Glu Val Glu Glu Glu Pro Gly Leu Arg Ser Leu Lys His
450 455 460

Asn Lys Arg Val Leu Phe Ser Arg Met Gly Pro Tyr Pro Gln Arg Lys
465 470 475 480

Val Ile Thr Phe Asn Arg Tyr Ser His Asp Phe Asn Phe His Ile Asn
485 490 495

Tyr Gly Asp Leu Gly Phe Leu Gly Pro Glu Asp Leu Arg Val Phe Gly
500 505 510

Ser Gln Asn Leu Thr Thr Val Lys Leu Lys Gly Val Gly Glu Ser Phe
515 520 525

Lys Lys Tyr Pro Asp Tyr Glu Ser Lys Gly Ile Lys Ala His Phe Asn
530 535 540

Leu Asp Glu Ser Gly Val Leu Ser Leu Asp Arg Val Glu Ser Val Phe
545 550 555 560

Glu Thr Leu Val Glu Asp Ser Pro Glu Glu Glu Ser Thr Leu Thr Lys
565 570 575

Leu Gly Asn Thr Ile Ser Ser Leu Phe Gly Gly Gly Thr Ser Ser Asp
580 585 590

Ala Lys Glu Asn Gly Thr Asp Ala Val Gln Glu Glu Glu Glu Ser Pro
595 600 605

Ala Glu Gly Ser Lys Asp Glu Pro Ala Glu Gln Gly Glu Leu Lys Glu
610 615 620

Glu Ala Glu Ala Pro Met Glu Asp Thr Ser Gln Pro Pro Pro Ser Glu
625 630 635 640

Pro Lys Gly Asp Ala Ala Arg Glu Gly Glu Thr Pro Asp Glu Lys Glu
645 650 655

Ser Gly Asp Lys Ser Glu Ala Gln Lys Pro Asn Glu Lys Gly Gln Ala
660 665 670

Gly Pro Glu Gly Val Pro Pro Ala Pro Glu Glu Glu Lys Lys Gln Lys
675 680 685

Pro Ala Arg Lys Gln Lys Met Val Glu Glu Ile Gly Val Glu Leu Ala
690 695 700

Val Leu Asp Leu Pro Asp Leu Pro Glu Asp Glu Leu Ala His Ser Val
705 710 715 720

Gln Lys Leu Glu Asp Leu Thr Leu Arg Asp Leu Glu Lys Gln Glu Arg
725 730 735

Glu Lys Ala Ala Asn Ser Leu Glu Ala Phe Ile Phe Glu Thr Gln Asp
740 745 750

Lys Leu Tyr Gln Pro Glu Tyr Gln Glu Val Ser Thr Glu Glu Gln Arg
755 760 765

Glu Glu Ile Ser Gly Lys Leu Ser Ala Thr Ser Thr Trp Leu Glu Asp
770 775 780

Glu Gly Phe Gly Ala Thr Thr Val Met Leu Lys Asp Lys Leu Ala Glu
785 790 795 800

Leu Arg Lys Leu Cys Gln Gly Leu Phe Phe Arg Val Glu Glu Arg Arg
805 810 815

Lys Trp Pro Glu Arg Leu Ser Ala Leu Asp Asn Leu Leu Asn His Ser
820 825 830

Ser Ile Phe Leu Lys Gly Ala Arg Leu Ile Pro Glu Met Asp Gln Ile
835 840 845

Phe Thr Asp Val Glu Met Thr Thr Leu Glu Lys Val Ile Asn Asp Thr
850 855 860

Trp Thr Trp Lys Asn Ala Thr Leu Ala Glu Gln Ala Lys Leu Pro Ala
865 870 875 880

Thr Glu Lys Pro Val Leu Leu Ser Lys Asp Ile Glu Ala Lys Met Met
885 890 895

Ala Leu Asp Arg Glu Val Gln Tyr Leu Leu Asn Lys Ala Lys Phe Thr
900 905 910

Lys Pro Arg Pro Arg Pro Lys Asp Lys Asn Gly Thr Arg Thr Glu Pro
915 920 925

Pro Leu Asn Ala Ser Ala Gly Asp Gln Glu Glu Lys Val Ile Pro Pro
930 935 940

Thr Gly Gln Thr Glu Glu Ala Lys Ala Ile Leu Glu Pro Asp Lys Glu
945 950 955 960

Gly Leu Gly Thr Glu Ala Ala Asp Ser Glu Pro Leu Glu Leu Gly Gly
965 970 975

Pro Gly Ala Glu Ser Glu Gln Ala Glu Gln Thr Ala Gly Gln Lys Arg
980 985 990

Pro Leu Lys Asn Asp Glu Leu
995



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3252 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) IDENTIFICATION METHOD: E 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 
TGAGGATGGA GCAGCGGTCG GGCCGCGGCT CCTAGGGGAG GCAGCGTGCT AGCTTCGGGG 60
GCGGGCCAGT AGCGGGAGCG AGGGCCGTAC GGACACCGGT CCCTTCGGCC TTGAAGTTCA 120 
GGCGCTGAGC TGCCCCCTCG CGCTCGGGGT GGGCCGGAAT CCATTTCTGG GAGTGGGATC 180 
TTCCACCTTC ATCAGGGTCA CAATGGCAGC TACAGTAAGG AGGCAGAGGC CAAGGAGGCT 240 
ACTCTGTTGG GCCTTGGTGG CTGTCCTCTT GGCAGACCTG TTGGCACTGA GTGACACACT 300 
GGCTGTGATG TCTGTGGACC TGGGCAGTGA ATCCATGAAG GTGGCCATTG TCAAGCCTGG 360 
AGTGCCCATG GAGATTGTAT TGAACAAGGA ATCTCGGAGG AAAACTCCGG TGACTGTGAC 420 
CTTGAAGGAA AACGAAAGGT TTCTAGGTGA CAGTGCAGCT GGCATGGCCA TCAAGAACCC 480 
AAAGGCTACG CTCCGTTATT TCCAGCACCT CCTTGGAAAG CAGGCAGATA ACCCTCATGT 540 
GGCTCTTTAC CGGTCCCGTT TCCCAGAACA TGAGCTCAAT GTTGACCCAC AGAGGCAGAC 600
TGTGCGCTTC CAGATCAGTC CGCAGCTGCA GTTCTCTCCC GAGGAGGTGC TGGGCATGGT 660 
TCTCAACTAC TCCCGTTCCC TGGCTGAAGA TTTTGCAGAA CAACCTATTA AGGATGCAGT 720 
GATCACCGTG CCAGCCTTTT TCAACCAGGC CGAGCGCCGA GCTGTGCTGC AGGCTGCTCG 780 
TATGGCTGGC CTCAAGGTGC TGCAGCTCAT CAATGACAAC ACTGCCACAG CCCTCAGCTA 840 
TGGTGTCTTC CGCCGGAAAG ATATCAATTC CACTGCACAG AATATCATGT TCTATGACAT 900 
GGGCTCGGGC AGCACTGTGT GTACCATCGT GACCTACCAA ACGGTGAAGA CTAAGGAGGC 960 
TGGGACGCAG CCACAGCTAC AGATCCGGGG CGTGGGATTT GACCGCACCC TGGGTGGCCT 1020 
GGAGATGGAG CTTCGGCTGC GAGAGCACCT GGCTAAGCTC TTCAATGAGC AGCGCAAGGG 1080
CCAGAAAGCC AAGGATGTTC GGGAAAACCC CCGAGCCATG GCCAAACTGC TTCGGGAAGC 1140
CAATCGGCTT AAAACCGTCC TGAGTGCCAA TGCTGATCAC ATGGCACAGA TTGAAGGCTT 1200
GATGGACGAT GTGGACTTCA AGGCAAAAGT AACTCGAGTG GAGTTTGAGG AGCTGTGTGC 1260
AGATTTGTTT GATCGAGTGC CTGGGCCTGT ACAGCAGGCC CTGCAGAGTG CTGAGATGAG 1320
CCTGGATCAA ATTGAGCAGG TGATCCTGGT GGGTGGGCCC ACTCGTGTTC CCAAAGTTCA 1380
AGAGGTGCTG CTGAAGCCTG TGGGCAAGGA GGAACTAGGA AAGAACATCA ATGCCGATGA 1440
AGCAGCTGCC ATGGGGGCCG TGTACCAGGC AGCGGCACTG AGCAAAGCCT TCAAAGTGAA 1500
GCCATTTGTT GTGCGTGATG CTGTTATTTA CCCCATCCTG GTGGAGTTCA CAAGGGAGGT 1560
GGAGGAGGAG CCTGGGCTTC GAAGCCTGAA GCACAATAAA CGTGTGCTCT TCTCCCGAAT 1620
GGGGCCCTAC CCTCAGCGCA AAGTCATCAC CTTTAACCGA TACAGCCATG ATTTCAACTT 1680
TCACATCAAC TACGGTGACC TGGGCTTCCT GGGGCCTGAG GATCTTCGGG TATTTGGCTC 1740
CCAGAATCTG ACCACAGTGA AACTAAAAGG TGTGGGAGAG AGCTTCAAGA AATATCCTGA 1800
CTATGAGTCC AAAGGCATCA AGGCCCACTT TAACCTAGAC GAGAGTGGAG TGCTCAGTTT 1860
AGACAGGGTG GAGTCCGTAT TCGAGACCCT GGTGGAGGAC AGCCCAGAGG AAGAGTCTAC 1920
TCTTACCAAA CTTGGCAACA CCATTTCCAG CCTGTTTGGC GGTGGTACCT CATCAGATGC 1980
CAAAGAGAAT GGTACTGATG CTGTACAGGA GGAGGAGGAG AGCCCTGCTG AGGGGAGCAA 2040
GGATGAGCCT GCAGAACAGG GGGAACTCAA GGAGGAAGCT GAAGCCCCAA TGGAGGATAC 2100
CTCCCAGCCT CCACCCTCTG AGCCTAAGGG GGATGCAGCC CGTGAGGGAG AAACACCTGA 2160
TGAAAAAGAA AGTGGGGACA AGTCTGAGGC CCAGAAGCCC AATGAGAAGG GGCAGGCAGG 2220
GCCTGAGGGT GTCCCTCCAG CTCCCGAGGA AGAAAAAAAG CAGAAACCTG CCCGGAAGCA 2280
GAAAATGGTG GAGGAGATAG GTGTGGAACT GGCTGTCTTG GACCTGCCAG ACTTGCCAGA 2340
GGATGAGCTG GCCCATTCCG TGCAGAAACT TGAGGACTTG ACCCTGCGAG ACCTTGAAAA 2400
GCAGGAGAGG GAGAAAGCTG CCAACAGCTT AGAAGCTTTT ATCTTTGAGA CCCAGGACAA 2460
ACTGTACCAA CCTGAGTACC AGGAAGTGTC CACTGAGGAA CAACGGGAGG AGATCTCTGG 2520
AAAACTCAGT GCCACTTCTA CCTGGCTGGA GGATGAGGGA TTTGGAGCCA CCACTGTGAT 2580
GTTGAAGGAC AAGCTGGCTG AGCTGAGAAA GCTGTGCCAA GGGCTGTTTT TTCGGGTGGA 2640
AGAGCGCAGG AAATGGCCAG AGCGGCTTTC AGCTCTGGAT AATCTCCTCA ATCACTCCAG 2700
CATTTTCCTC AAGGGTGCCC GACTCATCCC AGAGATGGAC CAGATCTTCA CTGACGTGGA 2760
GATGACAACG TTGGAGAAAG TCATCAATGA CACCTGGACC TGGAAGAATG CAACCCTGGC 2820
CGAGCAGGCC AAGCTTCCTG CCACAGAGAA ACCCGTGCTG CTTTCAAAAG ACATCGAGGC 2880 
CAAAATGATG GCCCTGGACC GGGAGGTGCA GTATCTACTC AATAAGGCCA AGTTTACTAA 2940 
ACCCCGGCCA CGGCCCAAGG ACAAGAATGG GACCCGGACA GAGCCTCCCC TCAATGCCAG 3000 
TGCTGGTGAC CAAGAGGAAA AGGTCATTCC ACCTACAGGC CAGACTGAAG AGGCGAAGGC 3060 
CATCTTAGAA CCTGACAAAG AAGGGCTTGG TACAGAGGCA GCAGACTCTG AGCCTCTGGA 3120 
ATTAGGAGGT CCTGGTGCAG AATCTGAACA GGCAGAGCAG ACAGCAGGGC AGAAGCGGCC 3180 
TTTGAAGAAT GATGAGCTGT GACCCCGCGC CTCCGCTCCA CTTGCCTCCA GCCCCTTCTC 3240 
CTACCACCTC TA 3252 

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

Leu Ala Val Met Ser Val Asp Leu Gly Ser Glu Ser Met Lys Val Ala
5 10 15

Ile Val Lys Pro Gly Val Pro Met Glu Ile Val Leu Asn Lys Glu
20 25 30

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid, synthetic nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 
AATACGACTC ACTATAGGGA 20 

(2) INFORMATION FOR SEQ ID N0:7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

Lys Pro Gly Val Pro Met Glu 
1 5 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid, synthetic nucleic acid 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
AARCCIGGIG TNCCNATGGA 20 
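
SEQ ID NO: 8 reads as a degenerate primer back-translated from the peptide of SEQ ID NO: 7 (Lys Pro Gly Val Pro Met Glu). The following is a minimal Python sketch of that consistency check, not part of the original listing. It assumes that the two OCR-ambiguous positions are inosine (I) and that inosine, like N, may stand for any base. The codon table is limited to the residues involved, and the function name is illustrative only.

```python
from itertools import product

# IUPAC nucleotide codes used in the primer. "I" (inosine) is treated as
# matching any base, which is an assumption of this sketch.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "N": "ACGT", "I": "ACGT",
}

# Standard genetic code, restricted to the residues of SEQ ID NO: 7.
CODONS = {
    "Lys": {"AAA", "AAG"},
    "Pro": {"CCA", "CCC", "CCG", "CCT"},
    "Gly": {"GGA", "GGC", "GGG", "GGT"},
    "Val": {"GTA", "GTC", "GTG", "GTT"},
    "Met": {"ATG"},
    "Glu": {"GAA", "GAG"},
}


def primer_encodes_peptide(primer, peptide):
    """Return True if every expansion of the degenerate primer is
    compatible, codon by codon, with the peptide. A trailing partial
    codon is compared as a prefix of the allowed codons."""
    for index, residue in enumerate(peptide):
        chunk = primer[3 * index: 3 * index + 3]
        if not chunk:
            break
        expansions = {
            "".join(bases) for bases in product(*(IUPAC[b] for b in chunk))
        }
        allowed = {codon[: len(chunk)] for codon in CODONS[residue]}
        if not expansions <= allowed:
            return False
    return True


PRIMER = "AARCCIGGIGTNCCNATGGA"  # SEQ ID NO: 8 with the spaces removed
PEPTIDE = ["Lys", "Pro", "Gly", "Val", "Pro", "Met", "Glu"]  # SEQ ID NO: 7

print(primer_encodes_peptide(PRIMER, PEPTIDE))
```

The last two primer bases (GA) cover only the first two positions of the Glu codon, which is why the final chunk is compared as a prefix.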

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

Lys Pro Gly Val Pro Met Glu Ile Val Leu Asn Lys Glu 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 10: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid, synthetic nucleic acid 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
GCACCCTTGA GGAAAATGCT 20 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid, synthetic nucleic acid 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
CCCAGAAGCC CAATGAGAAG 20 
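
Read against SEQ ID NO: 4, the primer of SEQ ID NO: 11 appears to match the listed strand at about positions 2190 to 2209, and the reverse complement of SEQ ID NO: 10 appears to match it at about positions 2699 to 2718. The Python sketch below illustrates the check. It uses only two 30-nucleotide fragments copied from the SEQ ID NO: 4 listing (positions 2181 to 2210 and 2691 to 2720); the helper names are illustrative and not part of the original document.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def locate(fragment, fragment_start, query):
    """Return the 1-based position of query in the full sequence, given a
    fragment and the 1-based position of its first base, or None."""
    offset = fragment.find(query)
    return None if offset < 0 else fragment_start + offset


SEQ_ID_10 = "GCACCCTTGAGGAAAATGCT"
SEQ_ID_11 = "CCCAGAAGCCCAATGAGAAG"

# Fragments of SEQ ID NO: 4 as printed in the listing.
FRAGMENT_2181 = "AGTCTGAGGCCCAGAAGCCCAATGAGAAGG"  # positions 2181..2210
FRAGMENT_2691 = "ATCACTCCAGCATTTTCCTCAAGGGTGCCC"  # positions 2691..2720

sense_position = locate(FRAGMENT_2181, 2181, SEQ_ID_11)
antisense_position = locate(FRAGMENT_2691, 2691, reverse_complement(SEQ_ID_10))

print(sense_position, antisense_position)
```

On this reading the two primers face each other on SEQ ID NO: 4, as expected for a primer pair, though the listing itself does not state how they were used.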

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2861 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

GAAAGAAGTA GACATGGGAG ACTTCATTTT GTTCTGTACT AAGAAAAATT CTTCTGCCTT 60 

GGGATGCTGT TGATCTATGA CCTTACCCCC AACCCTGTGC TCTCTGAAAC ATGTGCTGTG 120 

TCCACTCAGG GTTAAATGGA TTAAGGGCGG TGCAAGATGT GCTTTGTTAA ACAGATGCTT 180 

GAAGGCAGCA TGCTCGTTAG GAGTCATCAC CACTCCCTAA TCTCAAGTAC CCAGGGACAC 240 

AAACACTGCG GAAGGCCACA GGGTCCTCTG CCTAGGAAAG CCAGAGACCT TTGTTCACTT 300 

GTTTATCTGC TGACCTTCCC TCCACTATTG TCCTATGACC CTGCCAAATC CCCCTCTGCC 360 

AGAAACACCC AAGAATGATC AATAAAAAAA AAAAAAAAAA AAAAAGGAAG AATAGACTCT 420 

CTCTGGGACT GCCAATAATT TTTCCTTCTA AGCATAGACA CCGGACCACT CTCCACCTAA 480 

GCATCACGAA AAATGTAGAG AAAGGAAGAG CTAAGAGCTC CTTAAACAAG TTCAGGCTTG 540 

ACACAACCCT GGCCCTGACA GCCAGGGTCT TCAAGCGGGC CTTTCTGTGA AGGGTGGCCA 600 

GGCATCAACT TAGTAGGAGA GAAAACAGAT GACTTATTTC CATCCACACT TAAGGAAAAT 660 

GCAGTCTCCA AGGACTGCGT ACATTTCTTT TTCGAGAAGG AGTCTCGCTG TTGTCGCCCA 720 

GGCTGGAGTG CAGTGGCGCA GTCTGGGCTC ACAGCAACCT CTGCCTCCCG GATTCAAGCA 780 

ATTCTCCTGC CTCAGCCTCG TGAGTAGCTG GGATTACAGG CACCCGCCAC CACGCCTGGC 840 
TAATTTTTGT AGTTTTGGTA GAGACGGGGT TTCACCATGT TGGCCAGGCT GGTCTCGAAC 900 
TCCTGACCTC CAGTGATTCG CCCGCCTTGG CCTCCCAAAA TGCTGGGATT ACAGGCGTGA 960 
GCCACCGCGC CCGGGCGACT GCGCACATTT CTATGGAGCT GTAAGTTAAA AGAGAAGGCA 1020 
GTGAGGTGCT TCTGTCATTC TATGACAGAA ACAGCTAAAG AGTAGAGAAA TGTTCACAAG 1080 
ATTTAATAGA ACAGAAATAG GAGAAGGTGC ACACAAGCTC AACCAACTAT AGCCTCACAA 1140 

ATAAAAGTGT CTTTTGTGTG TAGTACTTAA GTTTGGAATA TTCTTTCTTA TACAAATGAG 1200 
TGGGGCTTAA CCTAAGAAAT CCTGGCCAGA TTCTGCGACG AATGCATCGG TTATCTCTGA 1260 
CCCATCAGCA AACATCTTTT TCTGTGGCTT CAGTTTCCTC AGTAAAACAG AGGGGGTTGC 1320 

GACGGACTCA GTCCGAGGCA CAGCCATTCT CCAACGTCTA TCCAAAGCCT AGGGCACCTC 1380 
AATACTAACC GGCAGGCCAG CGCCCCCTCC GCGGGGCTGC GGACAGGACG CCTGTTATTC 1440 
CATTCCTCGG CCGGGCTCTA CAGGTGACCG GAAGAAGAGC CCCGAGTGCG GGACTGCAGT 1500 
GCGCCCGACC TGCTCTAGGC GCAGGTCACT CCCGAACCCC GGCAGCAAAG CATCCAGCGC 1560 
CGGAAAAGGT CCCGCGGTCG CCCCGGGGCC GGCGCTGGGG AGGAAGGAGT GGAGCGCGCT 1620 
GGCCCCGTGA CGTGGTCCAA TCCCAGGCCG ACGCCGGCTG CTTCTGCCCA ACCGGTGGCT 1680 

GGTCCCCTCC GCCGCCCCCA TTACAAGGCT GGCAAAGGGA GGGGGCGGGG CCTGGGACGT 1740 
GGTCCAATGA GTACGCGCGC CGGGGCGGCG GGGGCGGGGC CGGGCGCGCA GCGCAGGGCC 1800 
GGGCGGCCGA GGCTCCAATG AGCGCCCGCC GCGTCCGGGG CCGGCTGGTG CGCGAGACGC 1860 
CGCCGAGAGG TTGGTGGCTA ATGTAACAGT TTGCAAACCG AGAGGAGTTG TGAAGGGCGC 1920 
GGGTGGGGGG CGCTGCCGGC CTCGTGGGTA CGTTCGTGCC GCGTCTGTCC CAGAGCTGGG 1980 

GCCGCAGGAG CGGAGGCAAG AGGTAGCGGG GGTGGATGGA GGTGCGGGCC GGCCACCCCT 2040 
CCTAGGGGAG ACAGCGTGCG AGCTCCGGGG GCGGGTCGGG AGCGCAAGGG AGGGCCGCGC 2100 
GGACGCCGGG CGCTCGGCCT CGCACCGGGG GGCACGCAGC TCGGCCCCCG GTCTGTCCCC 2160 

ACTTGCTGGG GCGGGCCGGG ATCCGTTTCC GGGAGTGGGA GCCGCCGCCT TCGTCAGGTG 2220 
GGGTTTAGGT GAACACCGGG TAACGGCTAC CCGCCGGGCG GGGAACCTTA CCGCCCCTGG 2280 
CACTGCGTCT GTGGGCACAG CGGGGCCGGG GAGTGAGCTG GGAAAGGGGA GGGGGCGGGA 2340 

CAACCCGCAG GGATGCCGAG GAGGAGATAG GCCTTTCCTT CATCCTAGCT ACCCCCAACG 2400 
TCATTACCTT TCTCTTCCCG TCCAGGCCCA GCTGGCTTTC CCCGTCAGCG GGGGAGCTCC 2460 
AGGTGTGGGG AGGTGGTTGA GCCCTGGGCG GGGATCCCTG GCCGCACCCC AGGTGTCTGA 2520 
CAACAGGCAC AGTGCTGCGG TGCGCCACTC ACTGCCTGTG TGGTGGACAA AAGGCTCGGG 2580 
TCTCCTTTCT CTTGTCCTGT TAGCTTCTCT GTTTAGGGAT GTGGCAAAGC CGAGGACCCA 2640 
TGCTCTTTCA CTTGGGCCTT TGTGTGGGCG CTGCTGGGAT GATTAGAGAA TGGTTTGTAC 2700 

CCATCAGGAG GGAGAAGGGG AGAAGTAGGC TGATCTGCCC TGGGTAAGAA TGAAGTAGAT 2760 
ATGAATCTTA CAGCCTCTCC GTTCTGGGAT GTGATTCTGT CTCCTTCACT CCGGGTATCC 2820 

AGTTTTAAGT GTTTTCTTTC TTCGCCTCCC CCAGGGGCAC T 2861 

SEQUENCE LISTING 



(1) GENERAL INFORMATION: 
(i) APPLICANT: 

(A) NAME: HSP Research Institute, Inc. 

(B) STREET: 2-8, Doshomachi 2-chome, Chuo-ku, 

(C) CITY: Osaka-shi, Osaka 

(E) COUNTRY: JP 

(F) POSTAL CODE (ZIP): none 

(ii) TITLE OF INVENTION: STRESS PROTEINS 
(iii) NUMBER OF SEQUENCES: 12 

(iv) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO) 

(v) CURRENT APPLICATION DATA: 

APPLICATION NUMBER: EP 96 12 0622.0 
(vi) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: JP 7-349661 

(B) FILING DATE: 20-DEC-1995 

(vi) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: JP 8-213181 

(B) FILING DATE: 23-JUL-1996 



(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 999 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

Met Ala Asp Lys Val Arg Arg Gln Arg Pro Arg Arg Arg Val Cys Trp 
1 5 10 15 

Ala Leu Val Ala Val Leu Leu Ala Asp Leu Leu Ala Leu Ser Asp Thr 
20 25 30 

Leu Ala Val Met Ser Val Asp Leu Gly Ser Glu Ser Met Lys Val Ala 
35 40 45 

Ile Val Lys Pro Gly Val Pro Met Glu Ile Val Leu Asn Lys Glu Ser 
50 55 60 

Arg Arg Lys Thr Pro Val Ile Val Thr Leu Lys Glu Asn Glu Arg Phe 
65 70 75 80 

Phe Gly Asp Ser Ala Ala Ser Met Ala Ile Lys Asn Pro Lys Ala Thr 
85 90 95 

Leu Arg Tyr Phe Gln His Leu Leu Gly Lys Gln Ala Asp Asn Pro His 
100 105 110 

Val Ala Leu Tyr Gln Ala Arg Phe Pro Glu His Glu Leu Thr Phe Asp 
115 120 125 

Pro Gln Arg Gln Thr Val His Phe Gln Ile Ser Ser Gln Leu Gln Phe 
130 135 140 

Ser Pro Glu Glu Val Leu Gly Met Val Leu Asn Tyr Ser Arg Ser Leu 
145 150 155 160 

Ala Glu Asp Phe Ala Glu Gln Pro Ile Lys Asp Ala Val Ile Thr Val 
165 170 175 

Pro Val Phe Phe Asn Gln Ala Glu Arg Arg Ala Val Leu Gln Ala Ala 
180 185 190 

Arg Met Ala Gly Leu Lys Val Leu Gln Leu Ile Asn Asp Asn Thr Ala 
195 200 205 

Thr Ala Leu Ser Tyr Gly Val Phe Arg Arg Lys Asp Ile Asn Thr Thr 
210 215 220 

Ala Gln Asn Ile Met Phe Tyr Asp Met Gly Ser Gly Ser Thr Val Cys 
225 230 235 240 

Thr Ile Val Thr Tyr Gln Met Val Lys Thr Lys Glu Ala Gly Met Gln 
245 250 255 

Pro Gln Leu Gln Ile Arg Gly Val Gly Phe Asp Arg Thr Leu Gly Gly 
260 265 270 

Leu Glu Met Glu Leu Arg Leu Arg Glu Arg Leu Ala Gly Leu Phe Asn 
275 280 285 

Glu Gln Arg Lys Gly Gln Arg Ala Lys Asp Val Arg Glu Asn Pro Arg 
290 295 300 

Ala Met Ala Lys Leu Leu Arg Glu Ala Asn Arg Leu Lys Thr Val Leu 
305 310 315 320 

Ser Ala Asn Ala Asp His Met Ala Gln Ile Glu Gly Leu Met Asp Asp 
325 330 335 

Val Asp Phe Lys Ala Lys Val Thr Arg Val Glu Phe Glu Glu Leu Cys 
340 345 350 

Ala Asp Leu Phe Glu Arg Val Pro Gly Pro Val Gln Gln Ala Leu Gln 
355 360 365 

Ser Ala Glu Met Ser Leu Asp Glu Ile Glu Gln Val Ile Leu Val Gly 
370 375 380 

Gly Ala Thr Arg Val Pro Arg Val Gln Glu Val Leu Leu Lys Ala Val 
385 390 395 400 

Gly Lys Glu Glu Leu Gly Lys Asn Ile Asn Ala Asp Glu Ala Ala Ala 
405 410 415 

Met Gly Ala Val Tyr Gln Ala Ala Ala Leu Ser Lys Ala Phe Lys Val 
420 425 430 

Lys Pro Phe Val Val Arg Asp Ala Val Val Tyr Pro Ile Leu Val Glu 
435 440 445 

Phe Thr Arg Glu Val Glu Glu Glu Pro Gly Ile His Ser Leu Lys His 
450 455 460 

Asn Lys Arg Val Leu Phe Ser Arg Met Gly Pro Tyr Pro Gln Arg Lys 
465 470 475 480 

Val Ile Thr Phe Asn Arg Tyr Ser His Asp Phe Asn Phe His Ile Asn 
485 490 495 

Tyr Gly Asp Leu Gly Phe Leu Gly Pro Glu Asp Leu Arg Val Phe Gly 
500 505 510 

Ser Gln Asn Leu Thr Thr Val Lys Leu Lys Gly Val Gly Asp Ser Phe 
515 520 525 

Lys Lys Tyr Pro Asp Tyr Glu Ser Lys Gly Ile Lys Ala His Phe Asn 
530 535 540 

Leu Asp Glu Ser Gly Val Leu Ser Leu Asp Arg Val Glu Ser Val Phe 
545 550 555 560 

Glu Thr Leu Val Glu Asp Ser Ala Glu Glu Glu Ser Thr Leu Thr Lys 
565 570 575 

Leu Gly Asn Thr Ile Ser Ser Leu Phe Gly Gly Gly Thr Thr Pro Asp 
580 585 590 

Ala Lys Glu Asn Gly Thr Asp Thr Val Gln Glu Glu Glu Glu Ser Pro 
595 600 605 

Ala Glu Gly Ser Lys Asp Glu Pro Gly Glu Gln Val Glu Leu Lys Glu 
610 615 620 

Glu Ala Glu Ala Pro Val Glu Asp Gly Ser Gln Pro Pro Pro Pro Glu 
625 630 635 640 

Pro Lys Gly Asp Ala Thr Pro Glu Gly Glu Lys Ala Thr Glu Lys Glu 
645 650 655 

Asn Gly Asp Lys Ser Glu Ala Gln Lys Pro Ser Glu Lys Ala Glu Ala 
660 665 670 

Gly Pro Glu Gly Val Ala Pro Ala Pro Glu Gly Glu Lys Lys Gln Lys 
675 680 685 

Pro Ala Arg Lys Arg Arg Met Val Glu Glu Ile Gly Val Glu Leu Val 
690 695 700 

Val Leu Asp Leu Pro Asp Leu Pro Glu Asp Lys Leu Ala Gln Ser Val 
705 710 715 720 

Gln Lys Leu Gln Asp Leu Thr Leu Arg Asp Leu Glu Lys Gln Glu Arg 
725 730 735 

Glu Lys Ala Ala Asn Ser Leu Glu Ala Phe Ile Phe Glu Thr Gln Asp 
740 745 750 

Lys Leu Tyr Gln Pro Glu Tyr Gln Glu Val Ser Thr Glu Glu Gln Arg 
755 760 765 

Glu Glu Ile Ser Gly Lys Leu Ser Ala Ala Ser Thr Trp Leu Glu Asp 
770 775 780 

Glu Gly Val Gly Ala Thr Thr Val Met Leu Lys Glu Lys Leu Ala Glu 
785 790 795 800 

Leu Arg Lys Leu Cys Gln Gly Leu Phe Phe Arg Val Glu Glu Arg Lys 
805 810 815 

Lys Trp Pro Glu Arg Leu Ser Ala Leu Asp Asn Leu Leu Asn His Ser 
820 825 830 

Ser Met Phe Leu Lys Gly Ala Arg Leu Ile Pro Glu Met Asp Gln Ile 
835 840 845 

Phe Thr Glu Val Glu Met Thr Thr Leu Glu Lys Val Ile Asn Glu Thr 
850 855 860 

Trp Ala Trp Lys Asn Ala Thr Leu Ala Glu Gln Ala Lys Leu Pro Ala 
865 870 875 880 

Thr Glu Lys Pro Val Leu Leu Ser Lys Asp Ile Glu Ala Lys Met Met 
885 890 895 

Ala Leu Asp Arg Glu Val Gln Tyr Leu Leu Asn Lys Ala Lys Phe Thr 
900 905 910 

Lys Pro Arg Pro Arg Pro Lys Asp Lys Asn Gly Thr Arg Ala Glu Pro 
915 920 925 

Pro Leu Asn Ala Ser Ala Ser Asp Gln Gly Glu Lys Val Ile Pro Pro 
930 935 940 

Ala Gly Gln Thr Glu Asp Ala Glu Pro Ile Ser Glu Pro Glu Lys Val 
945 950 955 960 

Glu Thr Gly Ser Glu Pro Gly Asp Thr Glu Pro Leu Glu Leu Gly Gly 
965 970 975 

Pro Gly Ala Glu Pro Glu Gln Lys Glu Gln Ser Thr Gly Gln Lys Arg 
980 985 990 

Pro Leu Lys Asn Asp Glu Leu 
995 

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4503 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 103..3099 
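
The CDS coordinates can be cross-checked against the protein length stated for SEQ ID NO: 1 and SEQ ID NO: 3 (999 amino acids each). The short Python sketch below is not part of the original listing. It assumes inclusive 1-based coordinates, as used in the FEATURE entries, and that the stop codon lies outside the stated range; the function name is illustrative.

```python
def cds_codon_count(start, end):
    """Number of codons in a CDS given inclusive 1-based coordinates."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return length // 3


# Coordinates as given in the FEATURE entries of the listing.
print(cds_codon_count(103, 3099))  # SEQ ID NO: 2
print(cds_codon_count(203, 3199))  # SEQ ID NO: 4
```

Both ranges span 2997 nucleotides, that is 999 codons, which agrees with the stated protein lengths.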



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

TTGTGAAGGG CGCGGGTGGG GGGCGCTGCC GGCCTCGTGG GTACGTTCGT GCCGCGTCTG 60 

TCCCAGAGCT GGGGCCGCAG GAGCGGAGGC AAGAGGGGCA CT ATG GCA GAC AAA 114 

Met Ala Asp Lys 
1 

GTT AGG AGG CAG AGG CCG AGG AGG CGA GTC TGT TGG GCC TTG GTG GCT 162 
Val Arg Arg Gln Arg Pro Arg Arg Arg Val Cys Trp Ala Leu Val Ala 
5 10 15 20 

GTG CTC TTG GCA GAC CTG TTG GCA CTG AGT GAT ACA CTG GCA GTG ATG 210 
Val Leu Leu Ala Asp Leu Leu Ala Leu Ser Asp Thr Leu Ala Val Met 
25 30 35 

TCT GTG GAC CTG GGC AGT GAG TCC ATG AAG GTG GCC ATT GTC AAA CCT 258 
Ser Val Asp Leu Gly Ser Glu Ser Met Lys Val Ala Ile Val Lys Pro 
40 45 50 

GGA GTG CCC ATG GAA ATT GTC TTG AAT AAG GAA TCT CGG AGG AAA ACA 306 
Gly Val Pro Met Glu Ile Val Leu Asn Lys Glu Ser Arg Arg Lys Thr 
55 60 65 

CCG GTG ATC GTG ACC CTG AAA GAA AAT GAA AGA TTC TTT GGA GAC AGT 354 
Pro Val Ile Val Thr Leu Lys Glu Asn Glu Arg Phe Phe Gly Asp Ser 
70 75 80 

GCA GCA AGC ATG GCG ATT AAG AAT CCA AAG GCT ACG CTA CGT TAC TTC 402 
Ala Ala Ser Met Ala Ile Lys Asn Pro Lys Ala Thr Leu Arg Tyr Phe 
85 90 95 100 

CAG CAC CTC CTG GGG AAG CAG GCA GAT AAC CCC CAT GTA GCT CTT TAC 450 
Gln His Leu Leu Gly Lys Gln Ala Asp Asn Pro His Val Ala Leu Tyr 
105 110 115 

CAG GCC CGC TTC CCG GAG CAC GAG CTG ACT TTC GAC CCA CAG AGG CAG 498 
Gln Ala Arg Phe Pro Glu His Glu Leu Thr Phe Asp Pro Gln Arg Gln 
120 125 130 

ACT GTG CAC TTT CAG ATC AGC TCG CAG CTG CAG TTC TCA CCT GAG GAA 546 
Thr Val His Phe Gln Ile Ser Ser Gln Leu Gln Phe Ser Pro Glu Glu 
135 140 145 

GTG TTG GGC ATG GTT CTC AAT TAT TCT CGT TCT CTA GCT GAA GAT TTT 594 
Val Leu Gly Met Val Leu Asn Tyr Ser Arg Ser Leu Ala Glu Asp Phe 
150 155 160 

GCA GAG CAG CCC ATC AAG GAT GCA GTG ATC ACC GTG CCA GTC TTC TTC 642 
Ala Glu Gln Pro Ile Lys Asp Ala Val Ile Thr Val Pro Val Phe Phe 
165 170 175 180 

AAC CAG GCC GAG CGC CGA GCT GTG CTG CAG GCT GCT CGT ATG GCT GGC 690 
Asn Gln Ala Glu Arg Arg Ala Val Leu Gln Ala Ala Arg Met Ala Gly 
185 190 195 

CTC AAA GTG CTG CAG CTC ATC AAT GAC AAC ACC GCC ACT GCC CTC AGC 738 
Leu Lys Val Leu Gln Leu Ile Asn Asp Asn Thr Ala Thr Ala Leu Ser 
200 205 210 

TAT GGT GTC TTC CGC CGG AAA GAT ATT AAC ACC ACT GCC CAG AAT ATC 786 
Tyr Gly Val Phe Arg Arg Lys Asp Ile Asn Thr Thr Ala Gln Asn Ile 
215 220 225 

ATG TTC TAT GAC ATG GGC TCA GGC AGC ACC GTA TGC ACC ATT GTG ACC 834 
Met Phe Tyr Asp Met Gly Ser Gly Ser Thr Val Cys Thr Ile Val Thr 
230 235 240 

TAC CAG ATG GTG AAG ACT AAG GAA GCT GGG ATG CAG CCA CAG CTG CAG 882 
Tyr Gln Met Val Lys Thr Lys Glu Ala Gly Met Gln Pro Gln Leu Gln 
245 250 255 260 

ATC CGG GGA GTA GGA TTT GAC CGT ACC CTG GGG GGC CTG GAG ATG GAG 930 
Ile Arg Gly Val Gly Phe Asp Arg Thr Leu Gly Gly Leu Glu Met Glu 
265 270 275 

CTC CGG CTT CGA GAA CGC CTG GCT GGG CTT TTC AAT GAG CAG CGC AAG 978 
Leu Arg Leu Arg Glu Arg Leu Ala Gly Leu Phe Asn Glu Gln Arg Lys 
280 285 290 

GGT CAG AGA GCA AAG GAT GTG CGG GAG AAC CCG CGT GCC ATG GCC AAG 1026 
Gly Gln Arg Ala Lys Asp Val Arg Glu Asn Pro Arg Ala Met Ala Lys 
295 300 305 

CTG CTG CGT GAG GCT AAT CGG CTC AAA ACC GTC CTC AGT GCC AAC GCT 1074 
Leu Leu Arg Glu Ala Asn Arg Leu Lys Thr Val Leu Ser Ala Asn Ala 
310 315 320 

GAC CAC ATG GCA CAG ATT GAA GGC CTG ATG GAT GAT GTG GAC TTC AAG 1122 
Asp His Met Ala Gln Ile Glu Gly Leu Met Asp Asp Val Asp Phe Lys 
325 330 335 340 

GCA AAA GTG ACT CGT GTG GAA TTT GAG GAG TTG TGT GCA GAC TTG TTT 1170 
Ala Lys Val Thr Arg Val Glu Phe Glu Glu Leu Cys Ala Asp Leu Phe 
345 350 355 

GAG CGG GTG CCT GGG CCT GTA CAG CAG GCC CTC CAG AGT GCC GAA ATG 1218 
Glu Arg Val Pro Gly Pro Val Gln Gln Ala Leu Gln Ser Ala Glu Met 
360 365 370 

AGT CTG GAT GAG ATT GAG CAG GTG ATC CTG GTG GGT GGG GCC ACT CGG 1266 
Ser Leu Asp Glu Ile Glu Gln Val Ile Leu Val Gly Gly Ala Thr Arg 
375 380 385 

GTC CCC AGA GTT CAG GAG GTG CTG CTG AAG GCC GTG GGC AAG GAG GAG 1314 
Val Pro Arg Val Gln Glu Val Leu Leu Lys Ala Val Gly Lys Glu Glu 
390 395 400 

CTG GGG AAG AAC ATC AAT GCA GAT GAA GCA GCC GCC ATG GGG GCA GTG 1362 
Leu Gly Lys Asn Ile Asn Ala Asp Glu Ala Ala Ala Met Gly Ala Val 
405 410 415 420 

TAC CAG GCA GCT GCG CTC AGC AAA GCC TTT AAA GTG AAG CCA TTT GTC 1410 
Tyr Gln Ala Ala Ala Leu Ser Lys Ala Phe Lys Val Lys Pro Phe Val 
425 430 435 

GTC CGA GAT GCA GTG GTC TAC CCC ATC CTG GTG GAG TTC ACG AGG GAG 1458 
Val Arg Asp Ala Val Val Tyr Pro Ile Leu Val Glu Phe Thr Arg Glu 
440 445 450 

GTG GAG GAG GAG CCT GGG ATT CAC AGC CTG AAG CAC AAT AAA CGG GTA 1506 
Val Glu Glu Glu Pro Gly Ile His Ser Leu Lys His Asn Lys Arg Val 
455 460 465 

CTC TTC TCT CGG ATG GGG CCC TAC CCT CAA CGC AAA GTC ATC ACC TTT 1554 
Leu Phe Ser Arg Met Gly Pro Tyr Pro Gln Arg Lys Val Ile Thr Phe 
470 475 480 

AAC CGC TAC AGC CAT GAT TTC AAC TTC CAC ATC AAC TAC GGC GAC CTG 1602 
Asn Arg Tyr Ser His Asp Phe Asn Phe His Ile Asn Tyr Gly Asp Leu 
485 490 495 500 

GGC TTC CTG GGG CCT GAA GAT CTT CGG GTA TTT GGC TCC CAG AAT CTG 1650 
Gly Phe Leu Gly Pro Glu Asp Leu Arg Val Phe Gly Ser Gln Asn Leu 
505 510 515 

ACC ACA GTG AAG CTA AAA GGG GTG GGT GAC AGC TTC AAG AAG TAT CCT 1698 
Thr Thr Val Lys Leu Lys Gly Val Gly Asp Ser Phe Lys Lys Tyr Pro 
520 525 530 

GAC TAC GAG TCC AAG GGC ATC AAG GCT CAC TTC AAC CTG GAT GAG AGT 1746 
Asp Tyr Glu Ser Lys Gly Ile Lys Ala His Phe Asn Leu Asp Glu Ser 
535 540 545 

GGC GTG CTC AGT CTA GAC AGG GTG GAG TCT GTA TTT GAG ACA CTG GTA 1794 
Gly Val Leu Ser Leu Asp Arg Val Glu Ser Val Phe Glu Thr Leu Val 
550 555 560 

GAG GAC AGC GCA GAA GAG GAA TCT ACT CTC ACC AAA CTT GGC AAC ACC 1842 
Glu Asp Ser Ala Glu Glu Glu Ser Thr Leu Thr Lys Leu Gly Asn Thr 
565 570 575 580 

ATT TCC AGC CTG TTT GGA GGC GGT ACC ACA CCA GAT GCC AAG GAG AAT 1890 
Ile Ser Ser Leu Phe Gly Gly Gly Thr Thr Pro Asp Ala Lys Glu Asn 
585 590 595 

GGT ACT GAT ACT GTC CAG GAG GAA GAG GAG AGC CCT GCA GAG GGG AGC 1938 
Gly Thr Asp Thr Val Gln Glu Glu Glu Glu Ser Pro Ala Glu Gly Ser 
600 605 610 

AAG GAC GAG CCT GGG GAG CAG GTG GAG CTC AAG GAG GAA GCT GAG GCC 1986 
Lys Asp Glu Pro Gly Glu Gln Val Glu Leu Lys Glu Glu Ala Glu Ala 
615 620 625 

CCA GTG GAG GAT GGC TCT CAG CCC CCA CCC CCT GAA CCT AAG GGA GAT 2034 
Pro Val Glu Asp Gly Ser Gln Pro Pro Pro Pro Glu Pro Lys Gly Asp 
630 635 640 



GCA ACC CCT GAG GGA GAA AAG GCC ACA GAA AAA GAA AAT GGG GAC AAG 2082 
Ala Thr Pro Glu Gly Glu Lys Ala Thr Glu Lys Glu Asn Gly Asp Lys 
645 650 655 660 

TCT GAG GCC CAG AAA CCA AGT GAG AAG GCA GAG GCA GGG CCT GAG GGC 2130 
Ser Glu Ala Gln Lys Pro Ser Glu Lys Ala Glu Ala Gly Pro Glu Gly 
665 670 675 

GTC GCT CCA GCC CCA GAG GGA GAG AAG AAG CAG AAG CCC GCC AGG AAG 2178 
Val Ala Pro Ala Pro Glu Gly Glu Lys Lys Gln Lys Pro Ala Arg Lys 
680 685 690 

CGG CGA ATG GTA GAG GAG ATC GGG GTG GAG CTG GTT GTT CTG GAC CTG 2226 
Arg Arg Met Val Glu Glu Ile Gly Val Glu Leu Val Val Leu Asp Leu 
695 700 705 

CCT GAC TTG CCA GAG GAT AAG CTG GCT CAG TCG GTG CAG AAA CTT CAG 2274 
Pro Asp Leu Pro Glu Asp Lys Leu Ala Gln Ser Val Gln Lys Leu Gln 
710 715 720 

GAC TTG ACA CTC CGA GAC CTG GAG AAG CAG GAA CGG GAA AAA GCT GCC 2322 
Asp Leu Thr Leu Arg Asp Leu Glu Lys Gln Glu Arg Glu Lys Ala Ala 
725 730 735 740 

AAC AGC TTG GAA GCG TTC ATA TTT GAG ACC CAG GAC AAG CTG TAC CAG 2370 
Asn Ser Leu Glu Ala Phe Ile Phe Glu Thr Gln Asp Lys Leu Tyr Gln 
745 750 755 

CCC GAG TAC CAG GAA GTG TCC ACA GAG GAG CAG CGT GAG GAG ATC TCT 2418 
Pro Glu Tyr Gln Glu Val Ser Thr Glu Glu Gln Arg Glu Glu Ile Ser 
760 765 770 

GGG AAG CTC AGC GCC GCA TCC ACC TGG CTG GAG GAT GAG GGT GTT GGA 2466 
Gly Lys Leu Ser Ala Ala Ser Thr Trp Leu Glu Asp Glu Gly Val Gly 
775 780 785 

GCC ACC ACA GTG ATG TTG AAG GAG AAG CTG GCT GAG CTG AGG AAG CTG 2514 
Ala Thr Thr Val Met Leu Lys Glu Lys Leu Ala Glu Leu Arg Lys Leu 
790 795 800 

TGC CAA GGG CTG TTT TTT CGG GTA GAG GAG CGC AAG AAG TGG CCC GAA 2562 
Cys Gln Gly Leu Phe Phe Arg Val Glu Glu Arg Lys Lys Trp Pro Glu 
805 810 815 820 

CGG CTG TCT GCC CTC GAT AAT CTC CTC AAC CAT TCC AGC ATG TTC CTC 2610 
Arg Leu Ser Ala Leu Asp Asn Leu Leu Asn His Ser Ser Met Phe Leu 
825 830 835 

AAG GGG GCC CGG CTC ATC CCA GAG ATG GAC CAG ATC TTC ACT GAG GTG 2658 
Lys Gly Ala Arg Leu Ile Pro Glu Met Asp Gln Ile Phe Thr Glu Val 
840 845 850 

GAG ATG ACA ACG TTA GAG AAA GTC ATC AAT GAG ACC TGG GCC TGG AAG 2706 
Glu Met Thr Thr Leu Glu Lys Val Ile Asn Glu Thr Trp Ala Trp Lys 
855 860 865 

AAT GCA ACT CTG GCC GAG CAG GCT AAG CTG CCC GCC ACA GAG AAG CCT 2754 
Asn Ala Thr Leu Ala Glu Gln Ala Lys Leu Pro Ala Thr Glu Lys Pro 
870 875 880 



GTG TTG CTC TCA AAA GAC ATT GAA GCT AAG ATG ATG GCC CTG GAC CGA 2802 
Val Leu Leu Ser Lys Asp Ile Glu Ala Lys Met Met Ala Leu Asp Arg 
885 890 895 900 

GAG GTG CAG TAT CTG CTC [...] 2850 
Glu Val Gln Tyr Leu Leu [...] 
905 910 915 

CGG CCT AAG GAC AAG AAT [...] 2898 
Arg Pro Lys Asp Lys Asn [...] 
920 925 930 

AGT GCC AGT GAC CAG GGG [...] 2946 
Ser Ala Ser Asp Gln Gly [...] 
935 940 945 

GAA GAT GCA GAG CCC ATT [...] 2994 
Glu Asp Ala Glu Pro Ile [...] 
950 955 960 

GAG CCA GGA GAC ACT GAG [...] 3042 
Glu Pro Gly Asp Thr Glu [...] 
965 970 975 980 

CCT GAA CAG AAA GAA CAA [...] 3090 
Pro Glu Gln Lys Glu Gln [...] 
985 990 995 

GAC GAA CTA TAACCCCCAC CTCTGTTTTC CCCATTCATC TCCACCCCCT 3139 
Asp Glu Leu 



TCCCCCACCA CTTCTATTTA TTTAACATCG AGGGTTGGGG GAGGGGTTGG TCCTGCCCTC 3199 

GGCTGGAGTT CCTTTCTCAC CCCTGTGATT TGGAGGTGTG GAGAAGGGGA AGGGAGGGAC 3259 

AGCTCACTGG TTCCTTCTGC AGTACCTCTG TGGTTAAAAA TGGAAACTGT TCTCCTCCCC 3319 

AGCCCCACTC CCTGTTCCCT ACCCATATAG GCCCTAAATT TGGGAAAAAT CACTATTAAT 3379 

TTCTGAATCC TTTGCCTGTG GGTAGGAAGA GAATGGCTGC CAGTGGCTGA TGGGTCCCGG 3439 

TGATGGGAAG GGTATCAGGT TGCTGGGGAG TTTCCACTCT TCTCTGGTGA TTGTTCCTTC 3499 

CCTCCCTTCC TCTCCCACCA TGCGATGAGC ATCCTTTCAG GCCAGTGTCT GCAGAGCCTC 3559 

AGTTACCAGG TTTGGTTTCT GAGTGCCTAT CTGTGCTCTT TCCTCCCTCT GCGGGCTTCT 3619 



CTTGCTCTGA GCCTCCCTTC CCCATTCCCA TGCAGCTCCT TTCCCCCTGG GTTTCCTTGG 3679 

CTTCCTGCAG CAAATTGGGC AGTTCTCTGC CCCTTGCCTA AAAGCCTGTA CCTCTGGATT 3739 

GGCGGAAGTA AATCTGGAAG GATTCTCACT CGTATTTCCC ACCCCTAGTG GCCAGAGGAG 3799 

GGAGGGGCAC AGTGAAGAAG GGAGCCCACC ACCTCTCCGA AGAGGAAAGC CACGTAGAGT 3859 

GGTTGGCATG GGGTGCCAGC ATCGTGCAAG CTCTGTCATA ATCTGCATCT TCCCAGCAGC 3919 

CTGGTACCCC AGGTTCCTGT AACTCCCTGC CTCCTCCTCT CTTCTGCTGT TCTGCTCCTC 3979 

CCAGACAGAG CCTTTCCCTC ACCCCCTGAC CCCCTGGGCT GACCAAAATG TGCTTTCTAC 4039 

TGTGAGTCCC TATCCCAAGA TCCTGGGGAA AGGAGAGACC ATGGTGTGAA TGTAGAGATG 4099 
CCACCTCCCT CTCTCTGAGG CAGGCCTGTG GATGAAGGAG GAGGGTCAGG GCTGGCCTTC 4159 

CTCTGTGCAT CACTCTGCTA GGTTGGGGGC CCCCGACCCA CCATACCTAC GCCTAGGGAG 4219 

TTCGTCCTCC AGTATTCCGT CTGTAGCAGG AGCTAGGGCT GCTGCCTCAG CTCCAAGACA 4279 

AGAATGAACC TGGCTGTTGC AGTCATTTTG TCTTTTCCTT TTTTTTTTTT TGCCACATTG 4339 

'3CAGAGATGG GACCTAAGGG TCCCACCCCT CACCCCACCC CCACCTCTTC TGTATGTTTG 4399 

AATTCTTTCA GTAGCTGTTG ATGCTGGTTG GACAGGTTTG AGTCAAATTG TACTTTGCTC 4459 

CATTGTTAAT TGAGAAACTG TTTCAATAAA ATATTCTTTT CTAC 4503 

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 999 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 



Met Ala Ala Thr Val Arg Arg Gln Arg Pro Arg Arg Leu Leu Cys Trp 
1 5 10 15 

Ala Leu Val Ala Val Leu Leu Ala Asp Leu Leu Ala Leu Ser Asp Thr 
20 25 30 

Leu Ala Val Met Ser Val Asp Leu Gly Ser Glu Ser Met Lys Val Ala 
35 40 45 

Ile Val Lys Pro Gly Val Pro Met Glu Ile Val Leu Asn Lys Glu Ser 
50 55 60 

Arg Arg Lys Thr Pro Val Thr Val Thr Leu Lys Glu Asn Glu Arg Phe 
65 70 75 80 

Leu Gly Asp Ser Ala Ala Gly Met Ala Ile Lys Asn Pro Lys Ala Thr 
85 90 95 

Leu Arg Tyr Phe Gln His Leu Leu Gly Lys Gln Ala Asp Asn Pro His 
100 105 110 

Val Ala Leu Tyr Arg Ser Arg Phe Pro Glu His Glu Leu Asn Val Asp 
115 120 125 

Pro Gln Arg Gln Thr Val Arg Phe Gln Ile Ser Pro Gln Leu Gln Phe 
130 135 140 

Ser Pro Glu Glu Val Leu Gly Met Val Leu Asn Tyr Ser Arg Ser Leu 
145 150 155 160 

Ala Glu Asp Phe Ala Glu Gln Pro Ile Lys Asp Ala Val Ile Thr Val 
165 170 175 

Pro Ala Phe Phe Asn Gln Ala Glu Arg Arg Ala Val Leu Gln Ala Ala 
180 185 190 

Arg Met Ala Gly Leu Lys Val Leu Gln Leu Ile Asn Asp Asn Thr Ala 
195 200 205 

Thr Ala Leu Ser Tyr Gly Val Phe Arg Arg Lys Asp Ile Asn Ser Thr 
210 215 220 

Ala Gln Asn Ile Met Phe Tyr Asp Met Gly Ser Gly Ser Thr Val Cys 
225 230 235 240 

Thr Ile Val Thr Tyr Gln Thr Val Lys Thr Lys Glu Ala Gly Thr Gln 
245 250 255 

Pro Gln Leu Gln Ile Arg Gly Val Gly Phe Asp Arg Thr Leu Gly Gly 
260 265 270 

Leu Glu Met Glu Leu Arg Leu Arg Glu His Leu Ala Lys Leu Phe Asn 
275 280 285 

Glu Gln Arg Lys Gly Gln Lys Ala Lys Asp Val Arg Glu Asn Pro Arg 
290 295 300 

Ala Met Ala Lys Leu Leu Arg Glu Ala Asn Arg Leu Lys Thr Val Leu 
305 310 315 320 

Ser Ala Asn Ala Asp His Met Ala Gln Ile Glu Gly Leu Met Asp Asp 
325 330 335 

Val Asp Phe Lys Ala Lys Val Thr Arg Val Glu Phe Glu Glu Leu Cys 
340 345 350 

Ala Asp Leu Phe Asp Arg Val Pro Gly Pro Val Gln Gln Ala Leu Gln 
355 360 365 

Ser Ala Glu Met Ser Leu Asp Gln Ile Glu Gln Val Ile Leu Val Gly 
370 375 380 

Gly Pro Thr Arg Val Pro Lys Val Gln Glu Val Leu Leu Lys Pro Val 
385 390 395 400 

Gly Lys Glu Glu Leu Gly Lys Asn Ile Asn Ala Asp Glu Ala Ala Ala 
405 410 415 

Met Gly Ala Val Tyr Gln Ala Ala Ala Leu Ser Lys Ala Phe Lys Val 
420 425 430 

Lys Pro Phe Val Val Arg Asp Ala Val Ile Tyr Pro Ile Leu Val Glu 
435 440 445 

Phe Thr Arg Glu Val Glu Glu Glu Pro Gly Leu Arg Ser Leu Lys His 
450 455 460 

Asn Lys Arg Val Leu Phe Ser Arg Met Gly Pro Tyr Pro Gln Arg Lys 
465 470 475 480 

Val Ile Thr Phe Asn Arg Tyr Ser His Asp Phe Asn Phe His Ile Asn 
485 490 495 

Tyr Gly Asp Leu Gly Phe Leu Gly Pro Glu Asp Leu Arg Val Phe Gly 
500 505 510 

Ser Gln Asn Leu Thr Thr Val Lys Leu Lys Gly Val Gly Glu Ser Phe 
515 520 525 

Lys Lys Tyr Pro Asp Tyr Glu Ser Lys Gly Ile Lys Ala His Phe Asn 
530 535 540 

Leu Asp Glu Ser Gly Val Leu Ser Leu Asp Arg Val Glu Ser Val Phe 
545 550 555 560 

Glu Thr Leu Val Glu Asp Ser Pro Glu Glu Glu Ser Thr Leu Thr Lys 
565 570 575 

Leu Gly Asn Thr Ile Ser Ser Leu Phe Gly Gly Gly Thr Ser Ser Asp 
580 585 590 

Ala Lys Glu Asn Gly Thr Asp Ala Val Gln Glu Glu Glu Glu Ser Pro 
595 600 605 

Ala Glu Gly Ser Lys Asp Glu Pro Ala Glu Gln Gly Glu Leu Lys Glu 
610 615 620 

Glu Ala Glu Ala Pro Met Glu Asp Thr Ser Gln Pro Pro Pro Ser Glu 
625 630 635 640 

Pro Lys Gly Asp Ala Ala Arg Glu Gly Glu Thr Pro Asp Glu Lys Glu 
645 650 655 

Ser Gly Asp Lys Ser Glu Ala Gln Lys Pro Asn Glu Lys Gly Gln Ala 
660 665 670 

Gly Pro Glu Gly Val Pro Pro Ala Pro Glu Glu Glu Lys Lys Gln Lys 
675 680 685 

Pro Ala Arg Lys Gln Lys Met Val Glu Glu Ile Gly Val Glu Leu Ala 
690 695 700 

Val Leu Asp Leu Pro Asp Leu Pro Glu Asp Glu Leu Ala His Ser Val 
705 710 715 720 

Gln Lys Leu Glu Asp Leu Thr Leu Arg Asp Leu Glu Lys Gln Glu Arg 
725 730 735 

Glu Lys Ala Ala Asn Ser Leu Glu Ala Phe Ile Phe Glu Thr Gln Asp 
740 745 750 

Lys Leu Tyr Gln Pro Glu Tyr Gln Glu Val Ser Thr Glu Glu Gln Arg 
755 760 765 

Glu Glu Ile Ser Gly Lys Leu Ser Ala Thr Ser Thr Trp Leu Glu Asp 
770 775 780 

Glu Gly Phe Gly Ala Thr Thr Val Met Leu Lys Asp Lys Leu Ala Glu 
785 790 795 800 

Leu Arg Lys Leu Cys Gln Gly Leu Phe Phe Arg Val Glu Glu Arg Arg 
805 810 815 

Lys Trp Pro Glu Arg Leu Ser Ala Leu Asp Asn Leu Leu Asn His Ser 
820 825 830 

Ser Ile Phe Leu Lys Gly Ala Arg Leu Ile Pro Glu Met Asp Gln Ile 
835 840 845 

Phe Thr Asp Val Glu Met Thr Thr Leu Glu Lys Val Ile Asn Asp Thr 
850 855 860 

Trp Thr Trp Lys Asn Ala Thr Leu Ala Glu Gln Ala Lys Leu Pro Ala 
865 870 875 880 

Thr Glu Lys Pro Val Leu Leu Ser Lys Asp Ile Glu Ala Lys Met Met 
885 890 895 

Ala Leu Asp Arg Glu Val Gln Tyr Leu Leu Asn Lys Ala Lys Phe Thr 
900 905 910 

Lys Pro Arg Pro Arg Pro Lys Asp Lys Asn Gly Thr Arg Thr Glu Pro 
915 920 925 

Pro Leu Asn Ala Ser Ala Gly Asp Gln Glu Glu Lys Val Ile Pro Pro 
930 935 940 

Thr Gly Gln Thr Glu Glu Ala Lys Ala Ile Leu Glu Pro Asp Lys Glu 
945 950 955 960 

Gly Leu Gly Thr Glu Ala Ala Asp Ser Glu Pro Leu Glu Leu Gly Gly 
965 970 975 

Pro Gly Ala Glu Ser Glu Gln Ala Glu Gln Thr Ala Gly Gln Lys Arg 
980 985 990 

Pro Leu Lys Asn Asp Glu Leu 
995 

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 3252 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 203..3199 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

TGAGGATGGA GCAGCGGTCG GGCCGCGGCT CCTAGGGGAG GCAGCGTGCT AGCTTCGGGG 60 

GCGGGCCAGT AGCGGGAGCG AGGGCCGTAC GGACACCGGT CCCTTCGGCC TTGAAGTTCA 120 

GGCGCTGAGC TGCCCCCTCG CGCTCGGGGT GGGCCGGAAT CCATTTCTGG GAGTGGGATC 180 
TTCCACCTTC ATCAGGGTCA CA ATG GCA GCT ACA GTA AGG AGG CAG AGG CCA 232 
Met Ala Ala Thr Val Arg Arg Gln Arg Pro 
1 5 10 

AGG AGG CTA CTC TGT TGG GCC TTG GTG GCT GTC CTC TTG GCA GAC CTG 280 
Arg Arg Leu Leu Cys Trp Ala Leu Val Ala Val Leu Leu Ala Asp Leu 
15 20 25 

TTG GCA CTG AGT GAC ACA CTG GCT GTG ATG TCT GTG GAC CTG GGC AGT 328 
Leu Ala Leu Ser Asp Thr Leu Ala Val Met Ser Val Asp Leu Gly Ser 
30 35 40 

GAA TCC ATG AAG GTG GCC ATT GTC AAG CCT GGA GTG CCC ATG GAG ATT 376 
Glu Ser Met Lys Val Ala Ile Val Lys Pro Gly Val Pro Met Glu Ile 
45 50 55 

GTA TTG AAC AAG GAA TCT CGG AGG AAA ACT CCG GTG ACT GTG ACC TTG 424 
Val Leu Asn Lys Glu Ser Arg Arg Lys Thr Pro Val Thr Val Thr Leu 
60 65 70 

AAG GAA AAC GAA AGG TTT CTA GGT GAC AGT GCA GCT GGC ATG GCC ATC 472 
Lys Glu Asn Glu Arg Phe Leu Gly Asp Ser Ala Ala Gly Met Ala Ile 
75 80 85 90 

AAG AAC CCA AAG GCT ACG CTC CGT TAT TTC CAG CAC CTC CTT GGA AAG 520 
Lys Asn Pro Lys Ala Thr Leu Arg Tyr Phe Gln His Leu Leu Gly Lys 
95 100 105 

CAG GCA GAT AAC CCT CAT GTG GCT CTT TAC CGG TCC CGT TTC CCA GAA 568 
Gln Ala Asp Asn Pro His Val Ala Leu Tyr Arg Ser Arg Phe Pro Glu 
110 115 120 

CAT GAG CTC AAT GTT GAC CCA CAG AGG CAG ACT GTG CGC TTC CAG ATC 616 
His Glu Leu Asn Val Asp Pro Gln Arg Gln Thr Val Arg Phe Gln Ile 
125 130 135 

AGT CCG CAG CTG CAG TTC TCT CCC GAG GAG GTG CTG GGC ATG GTT CTC 664 
Ser Pro Gln Leu Gln Phe Ser Pro Glu Glu Val Leu Gly Met Val Leu 
140 145 150 

AAC TAC TCC CGT TCC CTG GCT GAA GAT TTT GCA GAA CAA CCT ATT AAG 712 
Asn Tyr Ser Arg Ser Leu Ala Glu Asp Phe Ala Glu Gln Pro Ile Lys 
155 160 165 170 

GAT GCA GTG ATC ACC GTG CCA GCC TTT TTC AAC CAG GCC GAG CGC CGA 760 
Asp Ala Val Ile Thr Val Pro Ala Phe Phe Asn Gln Ala Glu Arg Arg 
175 180 185 

GCT GTG CTG CAG GCT GCT CGT ATG GCT GGC CTC AAG GTG CTG CAG CTC 808 
Ala Val Leu Gln Ala Ala Arg Met Ala Gly Leu Lys Val Leu Gln Leu 
190 195 200 

ATC AAT GAC AAC ACT GCC ACA GCC CTC AGC TAT GGT GTC TTC CGC CGG 856 
Ile Asn Asp Asn Thr Ala Thr Ala Leu Ser Tyr Gly Val Phe Arg Arg 
205 210 215 



AAA GAT ATC AAT TCC ACT GCA CAG AAT ATC ATG TTC TAT GAC ATG GGC 904 
Lys Asp Ile Asn Ser Thr Ala Gln Asn Ile Met Phe Tyr Asp Met Gly
220 225 230

TCG GGC AGC ACT GTG TGT ACC ATC GTG ACC TAC CAA ACG GTG AAG ACT 952
Ser Gly Ser Thr Val Cys Thr Ile Val Thr Tyr Gln Thr Val Lys Thr
235 240 245 250

AAG GAG GCT GGG ACG CAG CCA CAG CTA CAG ATC CGG GGC GTG GGA TTT 1000 

Lys Glu Ala Gly Thr Gln Pro Gln Leu Gln Ile Arg Gly Val Gly Phe

255 260 265 

GAC CGC ACC CTG GGT GGC CTG GAG ATG GAG CTT CGG CTG CGA GAG CAC 1048 
Asp Arg Thr Leu Gly Gly Leu Glu Met Glu Leu Arg Leu Arg Glu His

270 275 280 

CTG GCT AAG CTC TTC AAT GAG CAG CGC AAG GGC CAG AAA GCC AAG GAT 1096 
Leu Ala Lys Leu Phe Asn Glu Gln Arg Lys Gly Gln Lys Ala Lys Asp
285 290 295

GTT CGG GAA AAC CCC CGA GCC ATG GCC AAA CTG CTT CGG GAA GCC AAT 1144 
Val Arg Glu Asn Pro Arg Ala Met Ala Lys Leu Leu Arg Glu Ala Asn 
300 305 310 

CGG CTT AAA ACC GTC CTG AGT GCC AAT GCT GAT CAC ATG GCA CAG ATT 1192 
Arg Leu Lys Thr Val Leu Ser Ala Asn Ala Asp His Met Ala Gln Ile
315 320 325 330 

GAA GGC TTG ATG GAC GAT GTG GAC TTC AAG GCA AAA GTA ACT CGA GTG 1240 
Glu Gly Leu Met Asp Asp Val Asp Phe Lys Ala Lys Val Thr Arg Val 

335 340 345 

GAG TTT GAG GAG CTG TGT GCA GAT TTG TTT GAT CGA GTG CCT GGG CCT 1288 
Glu Phe Glu Glu Leu Cys Ala Asp Leu Phe Asp Arg Val Pro Gly Pro 

350 355 360 

GTA CAG CAG GCC CTG CAG AGT GCT GAG ATG AGC CTG GAT CAA ATT GAG 1336
Val Gln Gln Ala Leu Gln Ser Ala Glu Met Ser Leu Asp Gln Ile Glu
365 370 375 

CAG GTG ATC CTG GTG GGT GGG CCC ACT CGT GTT CCC AAA GTT CAA GAG 1384 
Gln Val Ile Leu Val Gly Gly Pro Thr Arg Val Pro Lys Val Gln Glu
380 385 390

GTG CTG CTG AAG CCT GTG GGC AAG GAG GAA CTA GGA AAG AAC ATC AAT 1432 
Val Leu Leu Lys Pro Val Gly Lys Glu Glu Leu Gly Lys Asn Ile Asn
395 400 405 410

GCC GAT GAA GCA GCT GCC ATG GGG GCC GTG TAC CAG GCA GCG GCA CTG 1480 
Ala Asp Glu Ala Ala Ala Met Gly Ala Val Tyr Gln Ala Ala Ala Leu

415 420 425

AGC AAA GCC TTC AAA GTG AAG CCA TTT GTT GTG CGT GAT GCT GTT ATT 1528 
Ser Lys Ala Phe Lys Val Lys Pro Phe Val Val Arg Asp Ala Val Ile

430 435 440 

TAC CCC ATC CTG GTG GAG TTC ACA AGG GAG GTG GAG GAG GAG CCT GGG 1576 
Tyr Pro Ile Leu Val Glu Phe Thr Arg Glu Val Glu Glu Glu Pro Gly
445 450 455

CTT CGA AGC CTG AAG CAC AAT AAA CGT GTG CTC TTC TCC CGA ATG GGG 1624 
Leu Arg Ser Leu Lys His Asn Lys Arg Val Leu Phe Ser Arg Met Gly 
460 465 470 

CCC TAC CCT CAG CGC AAA GTC ATC ACC TTT AAC CGA TAC AGC CAT GAT 1672
Pro Tyr Pro Gln Arg Lys Val Ile Thr Phe Asn Arg Tyr Ser His Asp
475 480 485 490

TTC AAC TTT CAC ATC AAC TAC GGT GAC CTG GGC TTC CTG GGG CCT GAG 1720 

Phe Asn Phe His Ile Asn Tyr Gly Asp Leu Gly Phe Leu Gly Pro Glu

495 500 505 

GAT CTT CGG GTA TTT GGC TCC CAG AAT CTG ACC ACA GTG AAA CTA AAA 1768 

Asp Leu Arg Val Phe Gly Ser Gln Asn Leu Thr Thr Val Lys Leu Lys

510 515 520 

GGT GTG GGA GAG AGC TTC AAG AAA TAT CCT GAC TAT GAG TCC AAA GGC 1816 

Gly Val Gly Glu Ser Phe Lys Lys Tyr Pro Asp Tyr Glu Ser Lys Gly 
525 530 535 

ATC AAG GCC CAC TTT AAC CTA GAC GAG AGT GGA GTG CTC AGT TTA GAC 1864
Ile Lys Ala His Phe Asn Leu Asp Glu Ser Gly Val Leu Ser Leu Asp
540 545 550 

AGG GTG GAG TCC GTA TTC GAG ACC CTG GTG GAG GAC AGC CCA GAG GAA 1912 

Arg Val Glu Ser Val Phe Glu Thr Leu Val Glu Asp Ser Pro Glu Glu 

555 560 565 570



GAG TCT ACT CTT ACC AAA CTT GGC AAC ACC ATT TCC AGC CTG TTT GGC 1960 

Glu Ser Thr Leu Thr Lys Leu Gly Asn Thr Ile Ser Ser Leu Phe Gly
575 580 585

GGT GGT ACC TCA TCA GAT GCC AAA GAG AAT GGT ACT GAT GCT GTA CAG 2008 
Gly Gly Thr Ser Ser Asp Ala Lys Glu Asn Gly Thr Asp Ala Val Gln

590 595 600 

GAG GAG GAG GAG AGC CCT GCT GAG GGG AGC AAG GAT GAG CCT GCA GAA 2056
Glu Glu Glu Glu Ser Pro Ala Glu Gly Ser Lys Asp Glu Pro Ala Glu
605 610 615



CAG GGG GAA CTC AAG GAG GAA GCT GAA GCC CCA ATG GAG GAT ACC TCC 2104 
Gln Gly Glu Leu Lys Glu Glu Ala Glu Ala Pro Met Glu Asp Thr Ser
620 625 630 

CAG CCT CCA CCC TCT GAG CCT AAG GGG GAT GCA GCC CGT GAG GGA GAA 2152 
Gln Pro Pro Pro Ser Glu Pro Lys Gly Asp Ala Ala Arg Glu Gly Glu
635 640 645 650 

ACA CCT GAT GAA AAA GAA AGT GGG GAC AAG TCT GAG GCC CAG AAG CCC 2200 
Thr Pro Asp Glu Lys Glu Ser Gly Asp Lys Ser Glu Ala Gln Lys Pro

655 660 665

AAT GAG AAG GGG CAG GCA GGG CCT GAG GGT GTC CCT CCA GCT CCC GAG 2248 
Asn Glu Lys Gly Gln Ala Gly Pro Glu Gly Val Pro Pro Ala Pro Glu

670 675 680 

GAA GAA AAA AAG CAG AAA CCT GCC CGG AAG CAG AAA ATG GTG GAG GAG 2296 
Glu Glu Lys Lys Gln Lys Pro Ala Arg Lys Gln Lys Met Val Glu Glu
685 690 695 



ATA GGT GTG GAA CTG GCT GTC TTG GAC CTG CCA GAC TTG CCA GAG GAT 2344 
Ile Gly Val Glu Leu Ala Val Leu Asp Leu Pro Asp Leu Pro Glu Asp
700 705 710

GAG CTG GCC CAT TCC GTG CAG AAA CTT GAG GAC TTG ACC CTG CGA GAC 2392
Glu Leu Ala His Ser Val Gln Lys Leu Glu Asp Leu Thr Leu Arg Asp
715 720 725 730 

CTT GAA AAG CAG GAG AGG GAG AAA GCT GCC AAC AGC TTA GAA GCT TTT 2440 
Leu Glu Lys Gln Glu Arg Glu Lys Ala Ala Asn Ser Leu Glu Ala Phe

735 740 745 

ATC TTT GAG ACC CAG GAC AAA CTG TAC CAA CCT GAG TAC CAG GAA GTG 2488 
Ile Phe Glu Thr Gln Asp Lys Leu Tyr Gln Pro Glu Tyr Gln Glu Val

750 755 760 

TCC ACT GAG GAA CAA CGG GAG GAG ATC TCT GGA AAA CTC AGT GCC ACT 2536 
Ser Thr Glu Glu Gln Arg Glu Glu Ile Ser Gly Lys Leu Ser Ala Thr
765 770 775 

TCT ACC TGG CTG GAG GAT GAG GGA TTT GGA GCC ACC ACT GTG ATG TTG 2584 
Ser Thr Trp Leu Glu Asp Glu Gly Phe Gly Ala Thr Thr Val Met Leu 
780 785 790 

AAG GAC AAG CTG GCT GAG CTG AGA AAG CTG TGC CAA GGG CTG TTT TTT 2632 
Lys Asp Lys Leu Ala Glu Leu Arg Lys Leu Cys Gln Gly Leu Phe Phe
795 800 805 810

CGG GTG GAA GAG CGC AGG AAA TGG CCA GAG CGG CTT TCA GCT CTG GAT 2680 

Arg Val Glu Glu Arg Arg Lys Trp Pro Glu Arg Leu Ser Ala Leu Asp 

815 820 825 

AAT CTC CTC AAT CAC TCC AGC ATT TTC CTC AAG GGT GCC CGA CTC ATC 2728 
Asn Leu Leu Asn His Ser Ser Ile Phe Leu Lys Gly Ala Arg Leu Ile

830 835 840 

CCA GAG ATG GAC CAG ATC TTC ACT GAC GTG GAG ATG ACA ACG TTG GAG 2776 
Pro Glu Met Asp Gln Ile Phe Thr Asp Val Glu Met Thr Thr Leu Glu
845 850 855 



AAA GTC ATC AAT GAC ACC TGG ACC TGG AAG AAT GGA ACC CTG GCC GAG 2824 
Lys Val Ile Asn Asp Thr Trp Thr Trp Lys Asn Ala Thr Leu Ala Glu
860 865 870

CAG GCC AAG CTT CCT GCC ACA GAG AAA CCC GTG CTG CTT TCA AAA GAC 2872 
Gln Ala Lys Leu Pro Ala Thr Glu Lys Pro Val Leu Leu Ser Lys Asp
875 880 885 890



ATC GAG GCC AAA ATG ATG GCC CTG GAC CGG GAG GTG CAG TAT CTA CTC 2920 
Ile Glu Ala Lys Met Met Ala Leu Asp Arg Glu Val Gln Tyr Leu Leu

895 900 905

AAT AAG GCC AAG TTT ACT AAA CCC CGG CCA CGG CCC AAG GAC AAG AAT 2968 
Asn Lys Ala Lys Phe Thr Lys Pro Arg Pro Arg Pro Lys Asp Lys Asn 

910 915 920 

GGC ACC CGG ACA GAG CCT CCC CTC AAT GCC AGT GCT GGT GAC CAA GAG 3016 
Gly Thr Arg Thr Glu Pro Pro Leu Asn Ala Ser Ala Gly Asp Gln Glu
925 930 935 

GAA AAG GTC ATT CCA CCT ACA GGC CAG ACT GAA GAG GCG AAG GCC ATC 3064 
Glu Lys Val Ile Pro Pro Thr Gly Gln Thr Glu Glu Ala Lys Ala Ile
940 945 950

TTA GAA CCT GAC AAA GAA GGG CTT GGT ACA GAG GCA GCA GAC TCT GAG 3112
Leu Glu Pro Asp Lys Glu Gly Leu Gly Thr Glu Ala Ala Asp Ser Glu
955 960 965 970

CCT CTG GAA TTA GGA GGT CCT GGT GCA GAA TCT GAA GAG GCA GAG GAG 3160

Pro Leu Glu Leu Gly Gly Pro Gly Ala Glu Ser Glu Gln Ala Glu Gln

975 980 985 

ACA GCA GGG CAG AAG CGG CCT TTG AAG AAT GAT GAG CTG TGACCCCGCG 3209 
Thr Ala Gly Gln Lys Arg Pro Leu Lys Asn Asp Glu Leu

990 995 

CCTCCGCTCC ACTTGCCTCC AGCCCCTTCT CCTACCACCT CTA 3252 
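
The pairing of each codon row with its residue row can be checked mechanically. The following is a minimal sketch, assuming the standard genetic code applies to this coding sequence; the function name `translate` and the lookup tables are illustrative and not part of the listing. It translates the final codon row (ending at the TGA stop codon) into three-letter residues.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_ONE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(codons: str) -> str:
    """Translate whitespace-separated codons into three-letter residues.

    Translation stops at the first stop codon (hypothetical helper).
    """
    residues = []
    for codon in codons.split():
        one_letter = CODON_TO_ONE[codon.upper()]
        if one_letter == "*":
            break
        residues.append(ONE_TO_THREE[one_letter])
    return " ".join(residues)


# Final codon row of the listing above, including the stop codon.
last_row = "ACA GCA GGG CAG AAG CGG CCT TTG AAG AAT GAT GAG CTG TGA"
print(translate(last_row))
```

Running the same function over every row and comparing the result with the printed residue row is one way to flag positions where a nucleotide or a residue was misread during text extraction.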



(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

Leu Ala Val Met Ser Val Asp Leu Gly Ser Glu Ser Met Lys Val Ala 
1 5 10 15

Ile Val Lys Pro Gly Val Pro Met Glu Ile Val Leu Asn Lys Glu

20 25 30 

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "synthetic nucleic acid"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

AATACGACTC ACTATAGGGA 20 
(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single






(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

Lys Pro Gly Val Pro Met Glu 
1 5 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "synthetic nucleic acid" 



(ix) FEATURE: 

(A) NAME/KEY: - 
(B) LOCATION: 6

(D) OTHER INFORMATION: /note= "N at position 6 is an
inosine residue."

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 9

(D) OTHER INFORMATION: /note= "N at position 9 is an
inosine residue."

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

AARCCNGGNG TNCCNATGGA 20 
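
Read codon by codon (AAR CCN GGN GTN CCN ATG GA), the 20-mer of SEQ ID NO: 8 looks like a degenerate back-translation of the heptapeptide of SEQ ID NO: 7, with the last codon cut to two bases. The listing does not state this relationship, so the sketch below treats it as an assumption. The table `DEGENERATE_CODON` and the function `degenerate_oligo` are illustrative names, and the table covers only the residues needed here, using IUPAC codes (R = A or G, N = any base; the listing annotates some N positions as inosine).

```python
# Degenerate codons in IUPAC notation for the residues used in this example only.
DEGENERATE_CODON = {
    "Lys": "AAR",  # AAA, AAG
    "Pro": "CCN",  # CCA, CCC, CCG, CCT
    "Gly": "GGN",  # GGA, GGC, GGG, GGT
    "Val": "GTN",  # GTA, GTC, GTG, GTT
    "Met": "ATG",  # single codon
    "Glu": "GAR",  # GAA, GAG
}


def degenerate_oligo(peptide, length=None):
    """Back-translate a three-letter peptide string into a degenerate oligonucleotide.

    If `length` is given, the oligonucleotide is truncated to that many bases.
    """
    oligo = "".join(DEGENERATE_CODON[residue] for residue in peptide.split())
    return oligo if length is None else oligo[:length]


peptide = "Lys Pro Gly Val Pro Met Glu"      # as given for SEQ ID NO: 7
print(degenerate_oligo(peptide, length=20))  # AARCCNGGNGTNCCNATGGA
```

Dropping the third base of the final codon removes the only remaining ambiguity at the 3' end, which is a common reason for truncating a degenerate oligonucleotide in this way.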
(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

Lys Pro Gly Val Pro Met Glu Ile Val Leu Asn Lys Glu
1 5 10

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "synthetic nucleic acid"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
GCACCCTTGA GGAAAATGCT 
(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "synthetic nucleic acid"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
CCCAGAAGCC CAATGAGAAG 
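
The two 20-mers of SEQ ID NO: 10 and SEQ ID NO: 11 can be located in the coding sequence printed earlier in this listing. The sketch below is a hedged illustration: it assumes the oligonucleotides are meant to match that sequence (the listing does not state their use), and the names `reverse_complement`, `locate`, `ROW_2728` and `ROWS_2200_2248` are hypothetical. The rows are copied from the listing with spaces removed, and positions follow the listing's own numbering.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def locate(oligo, region, region_start):
    """Return (1-based position, strand) of the oligo within the region, or None.

    `region_start` is the listing position of the first base of `region`.
    """
    idx = region.find(oligo)
    if idx >= 0:
        return region_start + idx, "+"
    idx = region.find(reverse_complement(oligo))
    if idx >= 0:
        return region_start + idx, "-"
    return None


# Row ending at position 2728 (positions 2681-2728).
ROW_2728 = "AATCTCCTCAATCACTCCAGCATTTTCCTCAAGGGTGCCCGACTCATC"
# Rows ending at positions 2200 and 2248 (positions 2153-2248).
ROWS_2200_2248 = (
    "ACACCTGATGAAAAAGAAAGTGGGGACAAGTCTGAGGCCCAGAAGCCC"
    "AATGAGAAGGGGCAGGCAGGGCCTGAGGGTGTCCCTCCAGCTCCCGAG"
)

print(locate("GCACCCTTGAGGAAAATGCT", ROW_2728, 2681))        # SEQ ID NO: 10
print(locate("CCCAGAAGCCCAATGAGAAG", ROWS_2200_2248, 2153))  # SEQ ID NO: 11
```

Under these assumptions, SEQ ID NO: 10 matches the complementary strand at positions 2699 to 2718, and SEQ ID NO: 11 matches the printed strand at positions 2190 to 2209.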
(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2861 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
GAAAGAAGTA GACATGGGAG ACTTCATTTT GTTCTGTACT AAGAAAAATT CTTCTGCCTT 60
GGGATGCTGT TGATCTATGA CCTTACCCCC AACCCTGTGC TCTCTGAAAC ATGTGCTGTG 120
TCCACTCAGG GTTAAATGGA TTAAGGGCGG TGCAAGATGT GCTTTGTTAA ACAGATGCTT 180
GAAGGCAGCA TGCTCGTTAG GAGTCATCAC CACTCCCTAA TCTCAAGTAC CCAGGGACAC 240
AAACACTGCG GAAGGCCACA GGGTCCTCTG CCTAGGAAAG CCAGAGACCT TTGTTCACTT 300
GTTTATCTGC TGACCTTCCC TCCACTATTG TCCTATGACC CTGCCAAATC CCCCTCTCSCC 360

AGAAACACCC AAGAATGATC AATAAAAAAA AAAAAAAAAA AAAAAGGAAG AATAGACTCT 420 

CTCTGGGACT GCCAATAATT TTTCCTTCTA AGCATAGACA CCGGACCACT CTCCACCTAA 480 

GCATCACGAA AAATGTAGAG AAAGGAAGAG CTAAGAGCTC CTTAAACAAG TTCAGGCTTG 540 

ACACAACCCT GGCCCTGACA GCCAGGGTCT TCAAGCGGGC CTTTCTGTGA AGGGTGGCCA 600 

GGCATCAACT TAGTAGGAGA GAAAACAGAT GACTTATTTC CATCCACACT TAAGGAAAAT 660 

GCAGTCTCCA AGGACTGCGT ACATTTCTTT TTCGAGAAGG AGTCTCGCTG TTGTCGCCCA 720

GGCTGGAGTG CAGTGGCGCA GTCTGGGCTC ACAGCAACCT CTGCCTCCCG GATTCAAGCA 780 

ATTCTCCTGC CTCAGCCTCG TGAGTAGCTG GGATTACAGG CACCCGCCAC CACGCCTGGC 840 

TAATTTTTGT AGTTTTGGTA GAGACGGGGT TTCACCATGT TGGCCAGGCT GGTCTCGAAC 900 

TCCTGACCTC CAGTGATTCG CCCGCCTTGG CCTCCCAAAA TGCTGGGATT ACAGGCGTGA 960 

GCCACCGCGC CCGGGCGACT GCGCACATTT CTATGGAGCT GTAAGTTAAA AGAGAAGGCA 1020 

GTGAGGTGCT TCTGTCATTC TATGACAGAA ACAGCTAAAG AGTAGAGAAA TGTTCACAAG 1080 

ATTTAATAGA ACAGAAATAG GAGAAGGTGC ACACAAGCTC AACCAACTAT AGCCTCACAA 1140 

ATAAAAGTGT CTTTTGTGTG TAGTACTTAA GTTTGGAATA TTCTTTCTTA TACAAATGAG 1200 

TGGGGCTTAA CCTAAGAAAT CCTGGCCAGA TTCTGCGACG AATGCATCGG TTATCTCTGA 1260

CCCATCAGCA AACATCTTTT TCTGTGGCTT CAGTTTCCTC AGTAAAACAG AGGGGGTTGC 1320 

GACGGACTCA GTCCGAGGCA CAGCCATTCT CCAACGTCTA TCCAAAGCCT AGGGCACCTC 1380 

AATACTAACC GGCAGGCCAG CGCCCCCTCC GCGGGGCTGC GGACAGGACG CCTGTTATTC 1440 

CATTCCTCGG CCGGGCTCTA CAGGTGACCG GAAGAAGAGC CCCGAGTGCG GGACTGCAGT 1500 

GCGCCCGACC TGCTCTAGGC GCAGGTCACT CCCGAACCCC GGCAGCAAAG CATCCAGCGC 1560 

CGGAAAAGGT CCCGCGGTCG CCCCGGGGCC GGCGCTGGGG AGGAAGGAGT GGAGCGCGCT 1620 

GGCCCCGTGA CGTGGTCCAA TCCCAGGCCG ACGCCGGCTG CTTCTGCCCA ACCGGTGGCT 1680 

GGTCCCCTCC GCCGCCCCCA TTACAAGGCT GGCAAAGGGA GGGGGCGGGG CCTGGGACGT 1740 

GGTCCAATGA GTACGCGCGC CGGGGCGGCG GGGGCGGGGC CGGGCGCGCA GCGCAGGGCC 1800 

GGGCGGCCGA GGCTCCAATG AGCGCCCGCC GCGTCCGGGG CCGGCTGGTG CGCGAGACGC 1860 

CGCCGAGAGG TTGGTGGCTA ATGTAACAGT TTGCAAACCG AGAGGAGTTG TGAAGGGCGC 1920 

GGGTGGGGGG CGCTGCCGGC CTCGTGGGTA CGTTCGTGCC GCGTCTGTCC CAGAGCTGGG 1980 

GCCGCAGGAG CGGAGGCAAG AGGTAGCGGG GGTGGATGGA GGTGCGGGCC GGCCACCCCT 2040

CCTAGGGGAG ACAGCGTGCG AGCTCCGGGG GCGGGTCGGG AGCGCAAGGG AGGGCCGCGC 2100 

GGACGCCCGG CGCTCGGCCT CGCACCGGGG GGCACGCAGC TCGGCCCCCG GTCTGTCCCC 2160

ACTTGCTGGG GCGGGCCGGG ATCCGTTTCC GGGAGTGGGA GCCGCCGCCT TCGTCAGGTG 2220

GOGTTTAGGT GAACACCGGG TAACGGCTAC CCGCCGGGCG GGGAACCTTA CCGCCCCTGG 2280

CACTGCGTCT GTGGGCACAG CGGGGCCGGG GAGTGAGCTG GGAAAGGGGA GGGGGCGGGA 2340

CAACCCGCAG GGATGCCGAG GAGGAGATAG GCCTTTCCTT CATCCTAGCT ACCCCCAACG 2400

TCATTACCTT TCTCTTCCCG TCCAGGCCCA GCTGGCTTTC CCCGTCAGCG GGGGAGCTCC 2460

AGCTCTGGGG AGGTGGTTGA GCCCTGGGCG GGGATCCCTG GCCGCACCCC AGGTGTCTGA 2520

CAACAGGCAC AGTGCTGCGG TGCGCCACTC ACTGCCTGTG TGGTGGACAA AAGGCTCGGG 2580

TCTCcrrrcr CTTGTCCTGT TAGCTTCTCT GTTTAGGGAT GTGGCAAAGC CGAGGACCCA 2640

TGCTCTTTCA CTTGGGCCTT TGTGTGGGCG CTGCTGGGAT GATTAGAGAA TGGTTTGTAC 2700

CCATCAGGAG GGAGAAGGGG AGAAGTAGGC TGATCTGCCC TGGGTAAGAA TGAAGTAGAT 2760

ATGAATCTTA CAGCCTCTCC GTTCTGGGAT GTGATTCTGT CTCCTTCACT CCGGGTATCC 2820

AGTTTTAAGT GTTTTCTTTC TTCGCCTCCC CCAGGGGCAC T 2861



Claims 

1. A polynucleotide encoding an ORP150 polypeptide selected from the group consisting of:

(a) polynucleotides encoding the polypeptide having the amino acid sequence as depicted in SEQ ID NO:1 or 
a fragment of the polypeptide; 

(b) polynucleotides comprising the coding region of the nucleotide sequence as shown in SEQ ID NO:2 or a 
fragment thereof; 

(c) polynucleotides encoding the polypeptide having the amino acid sequence as depicted in SEQ ID NO:3 or 
a fragment of the polypeptide; 

(d) polynucleotides comprising the coding region of the nucleotide sequence as depicted in SEQ ID NO:4 or a
fragment thereof; 

(e) polynucleotides encoding an ORP150 polypeptide which differs from the polypeptide encoded by the
polynucleotide of (a) or (c) due to deletion(s), addition(s), insertion(s) and/or substitution(s) of one or more amino
acid residues; and

(f) polynucleotides the complementary strand of which hybridizes to a polynucleotide of any one of (a) to (e) 
and which encode an ORP150 polypeptide; 

and the complementary strand of such a polynucleotide. 

2. The polynucleotide of claim 1 which is DNA. 

3. The polynucleotide of claim 2 which is genomic DNA.

4. The polynucleotide of claim 1 which is RNA. 

5. A vector comprising the polynucleotide of any one of claims 1 to 4. 

6. The vector of claim 5, in which the polynucleotide is operatively linked to regulatory elements which allow for
expression in prokaryotic or eukaryotic host cells.

7. A host cell transformed and genetically engineered with a polynucleotide of any one of claims 1 to 4 or with a vector
of claim 5 or 6.

8. A process for the preparation of an ORP150 polypeptide comprising culturing the host cell of claim 7 and
recovering the polypeptide from the cells and/or the culture medium.

9. A polypeptide encoded by the polynucleotide of any one of claims 1 to 4 or obtainable by the process of claim 8.

10. An antibody or fragment thereof which specifically recognizes the polypeptide of claim 9.

11. A nucleic acid molecule which specifically hybridizes to a polynucleotide of any one of claims 1 to 4.

12. A pharmaceutical composition comprising a polynucleotide of any one of claims 1 to 4, the polypeptide of claim 9,
the antibody of claim 10 and/or the nucleic acid molecule of claim 11 and optionally a pharmaceutically acceptable
carrier.

13. A diagnostic composition comprising a polynucleotide of any one of claims 1 to 4, the polypeptide of claim 9, the
antibody of claim 10 and/or the nucleic acid molecule of claim 11.

14. Use of the polynucleotide of any one of claims 1 to 4, the polypeptide of claim 9, the antibody of claim 10 or the
nucleic acid molecule of claim 11 for the preparation of a pharmaceutical composition for the treatment of ischemic
diseases.

15. A nucleic acid molecule having promoter activity and being able to promote transcription in cells when exposed to
hypoxia, selected from the group consisting of:

(a) polynucleotides comprising the nucleotide sequence as depicted in SEQ ID NO: 12 or a fragment thereof;
and

(b) polynucleotides hybridizing with the polynucleotide of (a).

1. Claims: 1-14 partially, and 15.

A human hypoxia-inducible protein of approx. 150 kDa, DNA
encoding it, vector comprising said DNA, host cell
transformed with said vector, process for preparation of the
peptide by expression in said host, an antibody or fragment
thereof against the peptide, a nucleic acid hybridizing to
said DNA, and pharmaceutical or diagnostic preparations
comprising the DNA, peptide, antibody or hybridizing nucleic
acid. Also a hypoxia-inducible promoter sequence.



2. Claims: 1-14 partially 

A rat hypoxia-inducible protein of approx. 150 kDa, DNA 
encoding it, vector comprising said DNA, host cell 
transformed with said vector, process for preparation of the 
peptide by expression in said host, an antibody or fragment 
thereof against the peptide, a nucleic acid hybridizing to 
said DNA, and pharmaceutical or diagnostic preparations 
comprising the DNA, peptide, antibody or hybridizing nucleic
acid. 